Imagine Pediatrics is a tech-enabled, pediatrician-led medical group focused on enhancing care for children with special health care needs. As a Staff Data Engineer, you will define how data moves through the platform and own the data pipelines that support clinical analytics and operational reporting, collaborating closely with engineering teams across the company.
Responsibilities:
- Design, build, and maintain scalable ELT pipelines that ingest data from clinical systems, APIs, and third-party integrations using webhook-based, API-based, and CDC (change data capture) approaches
- Architect and manage event-driven data pipelines in AWS — including cross-account configurations and dead-letter queue handling
- Write and maintain infrastructure-as-code to deploy and manage data ingestion workloads, primarily extending existing modules and patterns
- Orchestrate pipeline execution and monitoring using Dagster, ensuring observability and reliability across all workflows
- Implement data quality checks, alerting, and lineage tracking across the pipeline
- Identify and eliminate systemic failure modes in pipelines, improving reliability through long-term fixes rather than repeated incident remediation
- Partner with Analytics Engineers to ensure upstream data supports correct and consistent downstream models
- Set technical direction for data architecture and mentor other engineers
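Several of the responsibilities above (event-driven ingestion, dead-letter queue handling, data quality checks) fit a common pattern. A minimal sketch in pure Python, where the schema fields and the in-memory dead-letter list are illustrative assumptions rather than the team's actual implementation:

```python
import json
from dataclasses import dataclass, field

# Hypothetical required schema for an incoming clinical event payload.
REQUIRED_FIELDS = {"patient_id", "event_type", "occurred_at"}


@dataclass
class PipelineResult:
    accepted: list = field(default_factory=list)
    dead_letter: list = field(default_factory=list)  # stands in for an SQS DLQ


def validate(event: dict) -> bool:
    """Basic data quality check: required fields present and non-empty."""
    return REQUIRED_FIELDS.issubset(event) and all(event[f] for f in REQUIRED_FIELDS)


def ingest(raw_messages: list[str]) -> PipelineResult:
    """Parse and validate webhook payloads.

    Unparseable or invalid messages are routed to a dead-letter queue for
    later inspection instead of failing the whole batch, so one bad
    upstream payload cannot block the pipeline."""
    result = PipelineResult()
    for raw in raw_messages:
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            result.dead_letter.append(raw)
            continue
        (result.accepted if validate(event) else result.dead_letter).append(event)
    return result
```

In a production deployment the dead-letter list would be an SQS dead-letter queue and the batch loop a Lambda handler or Dagster op, but the routing logic is the same.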
Requirements:
- 7–10+ years of data engineering or platform engineering experience, including 2+ years in a senior or staff-level role owning production data systems
- Strong experience designing data pipelines using Python and SQL
- Strong experience with AWS services including Lambda, SQS, SNS, and S3
- Strong experience building event-driven and API-based ingestion systems (e.g., webhooks, asynchronous processing, or CDC patterns)
- Experience with data orchestration tools such as Dagster (or similar)
- Experience working with infrastructure-as-code (Terraform), primarily extending and adapting existing modules and patterns
- Experience with cloud data warehouses, preferably Snowflake, including performance-aware SQL development
- Proficiency in at least one programming language beyond SQL and Python (JavaScript, TypeScript, or Go) for automation, tooling, or serverless functions
- Demonstrated use of modern software engineering practices including version control, CI/CD, testing, and code review
- Proven ability to troubleshoot complex data and infrastructure issues across multiple systems and clearly communicate findings to both technical and non-technical stakeholders
- Proven ability to reason about downstream analytical impact of data pipeline design, including data freshness, grain, and transformation behavior
- Experience working closely with analytics engineering, data modeling, or similar downstream consumers of data
- Experience designing or managing IAM policies and least-privilege access models across data platform services
- Experience with dbt or modern analytics engineering workflows
- Experience working with healthcare data, FHIR resources, or clinical systems
- Familiarity with HIPAA compliance and handling of PHI in cloud environments
- Experience with high-volume ingestion systems including webhook-based tools (e.g., Hevo, Fivetran, or similar)
- Experience driving the adoption of AI tools to improve engineering productivity
- Exposure to real-world evidence (RWE), health economics and outcomes research (HEOR), or similar evidence-generation programs
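To make the CDC pattern referenced above concrete: a change feed can be applied idempotently with an upsert keyed on the source primary key, so replayed or late-arriving events never corrupt the target table. A sketch using sqlite3 as a stand-in for a warehouse (the `patients` table and columns are hypothetical):

```python
import sqlite3


def apply_cdc_batch(conn: sqlite3.Connection, changes: list[tuple]) -> None:
    """Apply a batch of CDC events idempotently.

    Each change is (op, id, name, updated_at). The upsert only wins when
    the incoming row is newer than the stored one, so replaying a batch
    or receiving events out of order is safe."""
    for op, row_id, name, updated_at in changes:
        if op == "delete":
            conn.execute("DELETE FROM patients WHERE id = ?", (row_id,))
        else:  # insert or update
            conn.execute(
                """
                INSERT INTO patients (id, name, updated_at)
                VALUES (?, ?, ?)
                ON CONFLICT(id) DO UPDATE SET
                    name = excluded.name,
                    updated_at = excluded.updated_at
                WHERE excluded.updated_at > patients.updated_at
                """,
                (row_id, name, updated_at),
            )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (id TEXT PRIMARY KEY, name TEXT, updated_at TEXT)")
apply_cdc_batch(conn, [
    ("insert", "p1", "Ada", "2024-01-01"),
    ("update", "p1", "Ada L.", "2024-01-02"),
    ("update", "p1", "stale", "2023-12-31"),  # late-arriving event: ignored
])
```

The same merge-on-key, latest-timestamp-wins logic translates directly to a Snowflake `MERGE` statement in a dbt incremental model.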