Komodo Health is dedicated to reducing the global burden of disease through smarter data use, and it is seeking a Staff Data Engineer to architect and deliver foundational data products. The role involves partnering with product, data science, quality, platform, and application teams to make healthcare data usable at enterprise scale, while driving innovation and technical excellence in data processing and product delivery.
Responsibilities:
- Architect, build, and deliver scalable Healthcare Map data products that power direct customer use cases, APIs, analytics surfaces, serving layers, and internal applications
- Design and implement high-performance data processing and serving patterns across large-scale healthcare datasets, using the right tools for the problem across SQL, Python, Spark, Rust, C++, and emerging AI-enabled engineering workflows
- Create shared data models, productized datasets, reusable libraries, and technical standards that become the foundation for downstream product, analytics, and application teams
- Build data products that are easy to consume through APIs, serving layers, exports, analytics environments, and customer-facing delivery mechanisms
- Partner with Product, Data Science, Quality, Platform, and application teams to translate complex healthcare use cases into production-grade technical designs and execution plans
- Lead complex, multi-quarter initiatives, making clear trade-offs across performance, scalability, maintainability, cost, reliability, and time-to-market
- Define and implement data quality checks, validation frameworks, observability, lineage, monitoring, and alerting to ensure Healthcare Map products are accurate, explainable, and reliable
- Raise the bar for system design, code quality, documentation, testing, CI/CD, and operational readiness across the team
- Mentor engineers through design reviews, technical deep dives, pairing, and architectural guidance, helping the team make better decisions at scale
Requirements:
- Data Product Engineering Expertise: Extensive experience building production-grade, large-scale data products, services, and analytical systems that serve real customer and business use cases
- Modern Data Systems Depth: Strong technical depth in SQL, distributed data processing, cloud data platforms, and MPP databases, along with high-scale compute frameworks and languages such as Spark, Python, Rust, C++, or equivalent technologies
- Architecture & Systems Design: Demonstrated ability to design data models, serving patterns, platform components, and system architectures for complex, high-volume data environments
- Healthcare Data Product Judgment: Ability to reason through data quality, patient identity, longitudinal patient journeys, claims or clinical data complexity, and downstream consumption needs
- AI & ML Enablement: Experience designing data workflows, feature pipelines, evaluation datasets, or infrastructure that supports AI/ML training, inference, experimentation, and monitoring
- Analytical & Statistical Rigor: Strong ability to use data analysis, statistical reasoning, hypothesis testing, and experimental design to validate product quality and business impact
- Technical Communication: Ability to explain technical decisions, trade-offs, risks, and delivery status clearly to engineers, product partners, data scientists, and senior stakeholders
- AI-Augmented Engineering: Ability to use AI tools such as ChatGPT, Gemini, Cursor, Claude, or similar systems to improve engineering productivity, design quality, testing, documentation, and decision-making
- Healthcare Data Experience: Experience with claims, clinical, RWE, provider, patient, or life sciences data, including familiarity with coding systems such as ICD-10, CPT, NDC, RxNorm, NPI, or taxonomy data
- Data Product Delivery: Experience building and operating data products that are consumed by customers, analytics users, APIs, applications, or serving layers
- High-Scale Data Architecture: Experience designing systems for large-volume data processing, productization, versioning, delivery, performance optimization, and cost efficiency
- Applied AI / Agentic Workflows: Experience using, designing, or integrating AI-enabled workflows to improve engineering productivity, data quality, extraction, curation, testing, or product delivery
- Fast-Growth Execution: Experience operating in high-growth or ambiguous environments where technical leaders must balance architecture, delivery, quality, and speed