Optum is a global organization that delivers care aided by technology to help millions of people live healthier lives. The Lead AI/ML Engineer will drive the design, development, and operationalization of AI/ML solutions to improve the reliability and clinical impact of workflows, while mentoring engineers and collaborating with various teams.
Responsibilities:
- Lead the architecture and implementation of scalable AI/ML solutions integrated into the OCM ecosystem (APIs, event streams, workflow engines, and integration layers)
- Own end-to-end ML lifecycle: problem framing, feature engineering, model development, validation, deployment, monitoring, drift detection, retraining strategy
- Establish best practices for MLOps: CI/CD for ML, model registries, automated evaluation gates, reproducible training, and secure deployment patterns
- Build production-grade inference services (real-time and batch) with clear SLOs, instrumentation, and rollback strategies
- Define and enforce data governance for ML features and training datasets (quality checks, lineage, documentation)
- Partner with product and clinical stakeholders to identify high-impact use cases and translate them into measurable outcomes (quality, productivity, stability, member/patient impact)
- Embed AI into workflows responsibly, with explainability, auditing, and human-in-the-loop guardrails
- Implement ML monitoring (performance, drift, bias checks where applicable) and integrate signals into operational dashboards and alerting
- Ensure solutions meet security and compliance needs (PHI/PII protection, least-privilege access, auditability)
- Drive responsible AI practices: evaluation transparency, documentation, risk assessment, and safe deployment patterns
- Mentor and guide ML engineers and software engineers, raising the bar on engineering quality, design rigor, and operational excellence
- Lead technical design reviews, influence platform direction, and align teams across engineering, data, operations, and product
- Act as a team player: unblock others, foster shared ownership, and improve execution predictability
Requirements:
- Bachelor's Degree in Computer Science, Data Science, Artificial Intelligence, Machine Learning, Engineering, or a related STEM field
- 10+ years of software engineering experience with 3+ years building and deploying ML systems into production
- Proven hands-on experience delivering end-to-end ML solutions (data → model → deployment → monitoring → iteration)
- Experience building API-based inference services and data pipelines in cloud-native environments (containerization, orchestration, CI/CD)
- Experience collaborating across functions (product, operations, data, security, compliance) and translating needs into technical solutions
- Solid skills in Python and modern ML libraries (e.g., PyTorch, TensorFlow, scikit-learn), plus strong software engineering fundamentals
- Expertise in MLOps practices: model versioning, reproducibility, automated testing/validation, monitoring, drift detection
- Solid understanding of data engineering concepts (feature stores, streaming/batch processing, data quality checks, lineage)
- Demonstrated leadership behaviors: mentoring, influencing without authority, driving clarity, and executing with accountability
- Excellent communication skills, including the ability to explain complex ML concepts to non-ML stakeholders and align on measurable outcomes
Preferred Qualifications:
- Master's Degree in Computer Science, Data Science, Artificial Intelligence, Machine Learning, Engineering, or a related STEM field
- Experience with healthcare systems and workflows (care management, utilization management, clinical operations) and/or working with PHI/PII in regulated environments (HIPAA-aligned controls)
- Familiarity with clinical data standards and patterns (claims/encounters, care plans, HL7/FHIR concepts, where relevant)
- Experience with LLMs / GenAI for enterprise use cases (summarization, classification, retrieval, workflow copilots), including: RAG architectures, evaluation frameworks, prompt/version control, safety guardrails
- Applied experience in one or more of the following areas: anomaly detection and time-series modeling, ranking/recommendation systems, NLP for clinical/operational text, causal inference / uplift modeling for operational optimization
- Experience with observability platforms and building ML-driven alerting/noise reduction (AIOps)
- Experience designing event-driven architectures (e.g., Kafka-style streaming), feature computation at scale, and real-time decisioning
- Experience with security-by-design and governance (model documentation, audit trails, approvals)
- Experience leading technical roadmaps, shaping platform standards, and coordinating across multiple teams
- Track record of establishing ML engineering standards (coding practices, model review process, reusable components)