Brooksource is seeking a hands-on, mid-level Data Engineer to support a security-focused data lake initiative. This role involves building reliable API ingestion pipelines, creating structured SQL datasets, and preparing data for downstream analytics and future AI use cases.

Responsibilities:

Build and own API-based ingestion pipelines pulling data from third-party systems
Handle authentication, pagination, retries, failures, and schema changes
Ingest, transform, and structure raw data into SQL-based datasets
Validate incoming data for accuracy, completeness, and consistency
Create and maintain clear documentation for datasets, fields, and ingestion logic
Manage and support datasets across multiple data sources (approx. 14 total)
Prepare clean, well-structured data for downstream analytics and AI consumption
Operate independently and take ownership of deliverables with minimal hand-holding

Requirements:

2+ years of hands-on experience writing and maintaining API ingestion pipelines in a big data environment, including managing datasets across 14 known data sources and supporting datasets with 10,000–15,000+ fields
3–6 years of hands-on experience as a Data Engineer or similar role
Strong experience building API-based ingestion pipelines
Proficiency with SQL / SQL Server and relational data modeling
Experience working in AWS (S3, Lambda, EC2, RDS, or similar)
Ability to validate and QA large, complex datasets
Comfortable working with large field counts and evolving schemas
Strong documentation and communication skills
Exposure to security, device, endpoint, or asset data
Experience with tools such as Qualys, CrowdStrike, Intune, JAMF, Wiz, or SCCM
Familiarity with data lake concepts (raw vs curated layers)
Experience supporting analytics or AI teams (data prep only, not model building)