Cohort AI is seeking a Director-level Data Engineer who will be a hands-on technical expert responsible for designing, building, and optimizing the core data platform. The role focuses on transforming complex healthcare data into actionable insights and ensuring the integrity and performance of large-scale clinical datasets.
Responsibilities:
- Pipeline Development: Design, build, and maintain scalable and reliable data pipelines (ETL/ELT) for the ingestion and processing of large volumes of customer clinical data
- Data Transformation: Directly implement code to efficiently map diverse customer datasets into our common data models (e.g., OMOP, FHIR), ensuring data fidelity and consistency
- Architecture & Optimization: Identify and implement technical improvements to the data engineering architecture, including optimizing distributed data processing and cloud resource utilization for cost and performance
- Quality & Governance: Develop and embed advanced data quality checks, monitoring, and validation frameworks to maintain the highest standards of data reliability in clinical datasets
- Technical Collaboration: Partner with Software Engineering and Data Science teams to translate complex business requirements into robust, scalable technical solutions and data models
- Mentorship & Standards: Act as a technical leader, establishing coding standards, performing code reviews, and mentoring mid-level engineers on deep technical subjects
Requirements:
- Bachelor's degree in Computer Science, Data Engineering, or a related field; Master's degree preferred
- 8+ years of hands-on experience in data engineering, focused on large-scale data systems
- Expert-level proficiency in SQL (any dialect) and Python, with deep experience in cloud platforms such as AWS, Azure, or GCP
- Extensive, proven track record of working hands-on with healthcare data, including advanced knowledge of relevant standards and data models (e.g., FHIR, OMOP)
- Deep technical mastery of distributed data processing and streaming frameworks (e.g., Apache Spark/PySpark), and experience with workflow orchestration tools (e.g., Airflow, Dagster)