Komodo Health is dedicated to reducing the global burden of disease by leveraging data. The company is seeking a Staff Data Engineer to architect and deliver foundational data products and serving capabilities that make complex healthcare data usable for customers and applications.
Responsibilities:
- Architect, build, and deliver scalable Healthcare Map data products that power direct customer use cases, APIs, analytics surfaces, serving layers, and internal applications
- Design and implement high-performance data processing and serving patterns across large-scale healthcare datasets, using the right tools for the problem across SQL, Python, Spark, Rust, C++, and emerging AI-enabled engineering workflows
- Create shared data models, productized datasets, reusable libraries, and technical standards that become the foundation for downstream product, analytics, and application teams
- Build data products that are easy to consume through APIs, serving layers, exports, analytics environments, and customer-facing delivery mechanisms
- Partner with Product, Data Science, Quality, Platform, and application teams to translate complex healthcare use cases into production-grade technical designs and execution plans
- Lead complex, multi-quarter initiatives, making clear trade-offs across performance, scalability, maintainability, cost, reliability, and time-to-market
- Define and implement data quality checks, validation frameworks, observability, lineage, monitoring, and alerting to ensure Healthcare Map products are accurate, explainable, and reliable
- Raise the bar for system design, code quality, documentation, testing, CI/CD, and operational readiness across the team
- Mentor engineers through design reviews, technical deep dives, pairing, and architectural guidance, helping the team make better decisions at scale
Requirements:
- Data Product Engineering Expertise: Extensive experience building production-grade, large-scale data products, services, and analytical systems that serve real customer and business use cases
- Modern Data Systems Depth: Strong technical depth across SQL, distributed data processing, cloud data platforms, MPP databases, and high-scale compute frameworks and languages such as Spark, Python, Rust, C++, or equivalent technologies
- Architecture & Systems Design: Demonstrated ability to design data models, serving patterns, platform components, and system architectures for complex, high-volume data environments
- Healthcare Data Product Judgment: Ability to reason through data quality, identity, longitudinal patient journeys, claims or clinical data complexity, and downstream consumption needs
- AI & ML Enablement: Experience designing data workflows, feature pipelines, evaluation datasets, or infrastructure that supports AI/ML training, inference, experimentation, and monitoring
- Analytical & Statistical Rigor: Strong ability to use data analysis, statistical reasoning, hypothesis testing, and experimental design to validate product quality and business impact
- Technical Communication: Ability to explain technical decisions, trade-offs, risks, and delivery status clearly to engineers, product partners, data scientists, and senior stakeholders
- AI-Augmented Engineering: Ability to use AI tools such as ChatGPT, Gemini, Cursor, Claude, or similar systems to improve engineering productivity, design quality, testing, documentation, and decision-making
- Healthcare Data Experience: Experience with claims, clinical, RWE, provider, patient, or life sciences data, including familiarity with coding systems and identifiers such as ICD-10, CPT, NDC, RxNorm, and NPI, or with taxonomy data
- Data Product Delivery: Experience building and operating data products that are consumed by customers, analytics users, APIs, applications, or serving layers
- High-Scale Data Architecture: Experience designing systems for large-volume data processing, productization, versioning, delivery, performance optimization, and cost efficiency
- Applied AI / Agentic Workflows: Experience using, designing, or integrating AI-enabled workflows to improve engineering productivity, data quality, extraction, curation, testing, or product delivery
- Fast-Growth Execution: Experience operating in high-growth or ambiguous environments where technical leaders must balance architecture, delivery, quality, and speed