As a Staff Data Engineer at Imagine Pediatrics, you will be the first dedicated Data Engineer on a hybrid team with Analytics Engineers, responsible for defining how data moves through our platform and owning the data pipelines that power clinical analytics, operational reporting, and external integrations.
You will ensure that data ingestion and integration decisions are made with a clear understanding of downstream analytical usage, including how data freshness, grain, and structure affect the models, reports, and systems that consume the data.
You will partner closely with Analytics Engineers, Product Engineers, and Platform Engineers to deliver a platform built for a high-growth, mission-driven healthcare organization.
Responsibilities
Design, build, and maintain scalable ELT pipelines that ingest data from clinical systems, APIs, and third-party integrations.
Architect and manage event-driven data pipelines in AWS, including cross-account configurations and dead-letter queue handling.
Write and maintain infrastructure-as-code to deploy and manage data ingestion workloads, primarily extending existing modules and patterns.
Orchestrate pipeline execution and monitoring using Dagster, ensuring observability and reliability across all workflows.
Implement data quality checks, alerting, and lineage tracking across the pipeline.
Identify and eliminate systemic failure modes in pipelines, improving reliability through long-term fixes rather than repeated incident remediation.
Partner with Analytics Engineers to ensure upstream data supports correct and consistent downstream models.
Set technical direction for data architecture and mentor other engineers.
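To make the data quality and grain responsibilities above concrete, here is a minimal, illustrative sketch of the kind of batch-level checks this role would own. It is not taken from Imagine Pediatrics' actual codebase; the field names, thresholds, and sample rows are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def check_grain_unique(rows, key_fields):
    """Return any duplicate keys, verifying key_fields form a unique grain."""
    seen, duplicates = set(), []
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates

def check_freshness(rows, ts_field, max_age):
    """Return rows whose timestamp is older than max_age (stale data)."""
    cutoff = datetime.now(timezone.utc) - max_age
    return [row for row in rows if row[ts_field] < cutoff]

# Hypothetical batch of ingested clinical-event rows.
batch = [
    {"patient_id": 1, "event_id": "a", "loaded_at": datetime.now(timezone.utc)},
    {"patient_id": 1, "event_id": "a", "loaded_at": datetime.now(timezone.utc)},
    {"patient_id": 2, "event_id": "b",
     "loaded_at": datetime.now(timezone.utc) - timedelta(days=2)},
]

dupes = check_grain_unique(batch, ["patient_id", "event_id"])
stale = check_freshness(batch, "loaded_at", timedelta(hours=24))
# One duplicate key and one stale row in this sample batch.
```

In practice, checks like these would run as Dagster asset checks with alerting wired to the results, rather than as standalone functions.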
Requirements
7–10+ years of data engineering or platform engineering experience, including at least two years in a senior or staff-level role owning production data systems.
Strong experience designing data pipelines using Python and SQL.
Strong experience with AWS services including Lambda, SQS, SNS, and S3.
Strong experience building event-driven and API-based ingestion systems (e.g., webhooks, asynchronous processing, or CDC patterns).
Experience with data orchestration tools such as Dagster (or similar).
Experience working with infrastructure-as-code (Terraform), primarily extending and adapting existing modules and patterns.
Experience with cloud data warehouses, preferably Snowflake, including performance-aware SQL development.
Proficiency in at least one additional language beyond SQL and Python (JavaScript, TypeScript, or Go) for automation, tooling, or serverless functions.
Demonstrated use of modern software engineering practices including version control, CI/CD, testing, and code review.
Proven ability to troubleshoot complex data and infrastructure issues across multiple systems and clearly communicate findings to both technical and non-technical stakeholders.
Proven ability to reason about downstream analytical impact of data pipeline design, including data freshness, grain, and transformation behavior.
Experience working closely with analytics engineering, data modeling, or similar downstream consumers of data.
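The event-driven ingestion and dead-letter handling called out in the requirements can be sketched in a few lines. This is a deliberately simplified, in-memory stand-in for an SQS-plus-Lambda pattern: the handler, message shapes, and retry count are hypothetical, and real implementations would use the AWS SDK and queue redrive policies instead.

```python
def process_batch(messages, handler, max_retries=2):
    """Apply handler to each message with retries.

    Messages that still fail after max_retries are routed to a
    dead-letter list instead of blocking the rest of the batch.
    """
    processed, dead_letter = [], []
    for msg in messages:
        for attempt in range(max_retries + 1):
            try:
                processed.append(handler(msg))
                break
            except Exception:
                if attempt == max_retries:
                    dead_letter.append(msg)
    return processed, dead_letter

def parse_event(msg):
    # Hypothetical handler: expects a dict with an "id" key and
    # raises KeyError on malformed input.
    return {"id": msg["id"], "status": "ingested"}

ok, dlq = process_batch([{"id": 1}, {"bad": True}, {"id": 3}], parse_event)
# Two messages succeed; the malformed one lands in the dead-letter list.
```

The design choice worth noting is that per-message failure isolation keeps a single bad payload from poisoning the whole batch, which is exactly what a dead-letter queue provides at the infrastructure level.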
Tech Stack
AWS
JavaScript
Python
SQL
Terraform
TypeScript
Go
Benefits
Competitive medical, dental, and vision insurance
Healthcare and Dependent Care FSA; Company-funded HSA
401(k) with 4% match, vested 100% from day one
Employer-paid short and long-term disability
Life insurance at 1x annual salary
20 days PTO + 10 Company Holidays & 2 Floating Holidays