H1 is a company dedicated to improving healthcare access through innovative data solutions. The company is seeking a Staff Data Engineer to lead high-visibility projects and deliver scalable, reliable data architectures for critical data assets.
Responsibilities:
- Act as a self-starter, taking ownership and driving execution with minimal day-to-day direction
- Lead high-visibility real-world evidence (RWE) projects, starting with claims data, and keep multiple initiatives moving by proactively unblocking teams
- Own the end-to-end architecture for critical data assets, ensuring solutions are scalable, reliable, and aligned with H1’s long-term vision
- Design, build, and optimize large-scale data pipelines (hundreds of TBs) for performance, reliability, and cost efficiency
- Partner with Product, Data Science, and downstream engineering teams to align priorities, manage dependencies, and deliver high-value outcomes
- Represent engineering in cross-functional forums, shaping roadmaps and reducing reliance on senior leadership for day-to-day decisions
- Develop deep domain expertise and mentor other engineers, helping raise the technical bar and influence the evolution of our data products
Requirements:
- 8+ years as a software, data, or backend engineer building and operating scalable, production-grade systems
- Experience with large-scale data processing (e.g., Spark/PySpark on EMR or similar) or scalable distributed backend systems, with the ability to quickly deepen expertise in our data stack (PySpark, EMR, Hudi/Delta)
- Strong proficiency in SQL, including writing and optimizing complex queries over large datasets
- Strong programming experience in Python (or a modern language with the ability to quickly ramp up in Python)
- Experience designing systems or large-scale datasets/pipelines with attention to performance, reliability, and maintainability
- Hands-on experience with modern engineering workflows and tooling such as Git, JIRA, and CI/CD systems (e.g., CircleCI)
- Comfort deploying and troubleshooting distributed workloads in cloud environments such as AWS EMR or Kubernetes
- Experience with workflow orchestration or job scheduling tools (e.g., Airflow, Argo)
- Demonstrated ability to independently drive complex, cross-team technical initiatives and influence stakeholders without formal authority
- Experience with streaming/messaging technologies (e.g., Kafka, Kinesis) is a nice-to-have
- Background in RWE, healthcare data, or other complex/regulated data domains is preferred
- Experience using AI-assisted coding tools (e.g., GitHub Copilot, Claude Code) to accelerate development while maintaining quality is encouraged