Datavant is the data collaboration platform trusted for healthcare, providing critical data solutions for organizations across the healthcare ecosystem. As a Staff Data Engineer, you will lead the design and build of the next-generation patient data platform, focusing on developing distributed data systems and platform capabilities to enhance the secure and intelligent use of data.

Responsibilities:

Lead the architecture and development of core data platform capabilities, including processing frameworks, storage patterns, and shared services
Design and implement multi-tenant, multi-cloud data systems with strong isolation, scalability, and operational durability
Build and operate large-scale distributed data processing systems across batch and real-time workloads
Define and evolve data lifecycle patterns, including ingestion, validation, transformation, enrichment, and serving
Establish data quality gates and validation frameworks to ensure trust, consistency, and auditability
Design systems that integrate with platform infrastructure, including CI/CD, deployment orchestration, observability, and infrastructure automation
Make sound architectural decisions across performance, cost, reliability, and maintainability tradeoffs
Lead ambiguous, high-impact initiatives where both problem definition and solution design require ownership
Contribute significantly to production code, setting standards for quality, testing, and operability

Requirements:

10+ years of experience building data-intensive or distributed systems, with a strong software engineering foundation
Proven experience designing and operating large-scale data platforms in production
Deep expertise in distributed data processing systems (e.g., Spark or similar big data technologies)
Strong software engineering fundamentals, including system design, testing, CI/CD, and production debugging
Experience building systems in cloud environments (AWS preferred), including storage, compute, and security patterns
Experience designing multi-tenant systems, with a focus on isolation, scalability, and reliability
Strong understanding of data modeling, pipeline design, and data quality enforcement
Ability to navigate ambiguity, evaluate tradeoffs, and drive durable technical decisions
Track record of being a high-impact, hands-on contributor who leads through both design and execution
Strong candidates will have experience with several of the following: distributed data processing frameworks (e.g., Spark, Flink, or similar); cloud data platforms (e.g., Databricks, Snowflake, or equivalent); data transformation and modeling frameworks (e.g., dbt or equivalent); workflow orchestration systems (e.g., Airflow or similar); streaming and event-driven systems (e.g., Kafka or equivalent); infrastructure-as-code (e.g., Terraform); modern table formats and lakehouse architectures (e.g., Iceberg, Delta, or similar)
Experience building data systems that support AI-driven use cases, including low-latency data access patterns, feature generation and ML data pipelines, and iterative, feedback-driven data workflows
Familiarity with agentic or AI-assisted coding tools, and the ability to leverage them to improve development velocity and code quality
Comfort operating in environments where AI augments both system design and development workflows
Experience in regulated environments (e.g., healthcare, finance)
Familiarity with interoperability standards (e.g., FHIR, HL7, or similar)
Experience leading large-scale platform migrations or architectural transformations

Staff Data Engineer, Platform Engineering
