Definitive Healthcare is a company focused on transforming data and analytics into meaningful intelligence for the healthcare sector. The company is seeking a Senior Machine Learning Engineer to lead the design and implementation of AI/ML systems that enhance customer experience and operational efficiency.
Responsibilities:
- Lead the design and implementation of scalable, production-grade ML systems in cloud environments with a focus on performance, reliability, and reproducibility
- Collaborate with product managers and senior stakeholders to define and prioritize ML initiatives aligned with business goals
- Oversee the architecture and evolution of data pipelines for multi-terabyte datasets, ensuring efficiency and reliability
- Guide the development of high-impact features and label sets across diverse domains such as healthcare and consumer analytics
- Lead experimentation strategy, including design of A/B tests, advanced validation methods, and lifecycle management using tools like MLflow and Databricks
- Drive continual model improvement through advanced techniques such as automated retraining, model decay analysis, and bias mitigation
- Champion rapid prototyping and proof-of-concept development to evaluate emerging technologies and ML techniques
- Lead technical explorations into new ML architectures (e.g., foundation models, causal inference, time series deep learning)
- Serve as a technical leader and trusted advisor, working closely with product, engineering, data, and executive teams to shape end-to-end ML solutions
- Set standards for code quality, performance, and documentation, and mentor junior engineers in best practices
Requirements:
- Bachelor's or Master's degree in Computer Science, Machine Learning, Data Science, or a related field (or equivalent practical experience)
- 5+ years of industry experience as an ML Engineer, Data Scientist, or Data Engineer, with a focus on deploying and scaling ML systems
- Deep expertise in Python, SQL, and PySpark for distributed data processing
- Proven experience designing robust ML pipelines using MLflow or an equivalent tool
- Strong command of ML frameworks (e.g., scikit-learn, TensorFlow, XGBoost, PyTorch)
- Hands-on experience deploying models in cloud-based environments (e.g., AWS, GCP, Azure, or Databricks)
- Proven ability to manage end-to-end ML lifecycles at scale, including data ingestion, training, evaluation, deployment, and monitoring
- Excellent communication skills and demonstrated ability to influence cross-functional teams
Preferred Qualifications:
- Experience working with healthcare claims, EHR, or life sciences datasets
- Advanced degree (M.S. or Ph.D.) in Computer Science, Data Science, or related technical field
- Strong knowledge of MLOps practices including CI/CD for ML, automated retraining, and model versioning
- Experience with deep learning architectures for time series forecasting, sequential data, or hierarchical modeling
- Proficiency in designing evaluation protocols and defining performance metrics that rigorously assess model effectiveness and inform decisions
- Comfort operating in fast-paced, high-ownership environments and the ability to prioritize multiple high-impact projects