Brooksource is seeking a hands-on, mid-level Data Engineer to support a security-focused data lake initiative. This role involves building reliable API ingestion pipelines, creating structured SQL datasets, and preparing data for downstream analytics and future AI use cases.
Responsibilities:
- Build and own API-based ingestion pipelines pulling data from third-party systems
- Handle authentication, pagination, retries, failures, and schema changes
- Ingest, transform, and structure raw data into SQL-based datasets
- Validate incoming data for accuracy, completeness, and consistency
- Create and maintain clear documentation for datasets, fields, and ingestion logic
- Manage and support datasets across multiple data sources (approx. 14 total)
- Prepare clean, well-structured data for downstream analytics and AI consumption
- Operate independently and take ownership of deliverables with minimal supervision
Requirements:
- 2+ years of hands-on experience writing and maintaining API ingestion pipelines in a big data environment, including managing datasets across many data sources (14 on this project) and supporting datasets with 10K–15K+ fields
- 3–6 years of hands-on experience as a Data Engineer or similar role
- Strong experience building API-based ingestion pipelines
- Proficiency with SQL / SQL Server and relational data modeling
- Experience working in AWS (S3, Lambda, EC2, RDS, or similar services)
- Ability to validate and QA large, complex datasets
- Comfortable working with large field counts and evolving schemas
- Strong documentation and communication skills
- Exposure to security, device, endpoint, or asset data
- Experience with tools such as Qualys, CrowdStrike, Intune, JAMF, Wiz, or SCCM
- Familiarity with data lake concepts (raw vs curated layers)
- Experience supporting analytics or AI teams (data preparation only, not model building)