Datavant is the data collaboration platform trusted for healthcare, dedicated to making the world's health data secure, accessible, and actionable. As a Staff Data Engineer, you will lead the design and build of a next-generation patient data platform, focusing on developing distributed data systems and platform capabilities that support secure and intelligent data use across a multi-tenant, multi-cloud environment.
Responsibilities:
- Lead the architecture and development of core data platform capabilities, including processing frameworks, storage patterns, and shared services
- Design and implement multi-tenant, multi-cloud data systems with strong isolation, scalability, and operational durability
- Build and operate large-scale distributed data processing systems across batch and real-time workloads
- Define and evolve data lifecycle patterns, including ingestion, validation, transformation, enrichment, and serving
- Establish data quality gates and validation frameworks to ensure trust, consistency, and auditability
- Design systems that integrate with platform infrastructure, including CI/CD, deployment orchestration, observability, and infrastructure automation
- Make sound architectural decisions across performance, cost, reliability, and maintainability tradeoffs
- Lead ambiguous, high-impact initiatives where both problem definition and solution design require ownership
- Contribute significantly to production code, setting standards for quality, testing, and operability
Requirements:
- 10+ years of experience building data-intensive or distributed systems, with a strong software engineering foundation
- Proven experience designing and operating large-scale data platforms in production
- Deep expertise in distributed data processing systems (e.g., Spark or similar big data technologies)
- Strong software engineering fundamentals, including system design, testing, CI/CD, and production debugging
- Experience building systems in cloud environments (AWS preferred), including storage, compute, and security patterns
- Experience designing multi-tenant systems, with a focus on isolation, scalability, and reliability
- Strong understanding of data modeling, pipeline design, and data quality enforcement
- Ability to navigate ambiguity, evaluate tradeoffs, and drive durable technical decisions
- Track record of being a high-impact, hands-on contributor who leads through both design and execution
- Experience building data systems that support AI-driven use cases, including low-latency data access patterns, feature generation and ML data pipelines, and iterative, feedback-driven data workflows
- Familiarity with agentic or AI-assisted coding tools, and the ability to leverage them to improve development velocity and code quality
- Comfort operating in environments where AI augments both system design and development workflows
- Experience in regulated environments (e.g., healthcare, finance)
- Familiarity with interoperability standards (e.g., FHIR, HL7, or similar)
- Experience leading large-scale platform migrations or architectural transformations