Evolent partners with health plans and providers to improve outcomes for people with complex health conditions. The company is seeking an ML/LLM Operations Engineer to ensure its AI systems deliver reliable, compliant results in healthcare, working closely with Data Science, Engineering, and other cross-functional teams.
Responsibilities:
- Develop and maintain standardized evaluation frameworks to consistently measure LLM performance across relevant healthcare metrics
- Build monitoring systems using Logfire to track AI model performance, detect drift, and alert the team to anomalies
- Create testing infrastructure for prompt versions, model selection, and quality assurance processes
- Design and implement audit sampling processes for continuous quality monitoring and clinical review workflows
- Oversee regulatory compliance processes, including documentation for bias assessments, model cards, and audit trails required in healthcare
- Optimize LLM operations through intelligent model selection, prompt engineering, and cost management strategies
- Support the transition from successful POCs to production-ready services with appropriate testing and validation
- Partner with DevOps on infrastructure requirements while focusing on AI-specific monitoring and optimization
- Create and maintain documentation, runbooks, and operational procedures for all deployed AI systems
- Collaborate with the Clinical Support Liaison to incorporate clinical feedback into system improvements
- Prepare regular reports on AI system quality, performance metrics, and compliance status
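To make the monitoring and audit-sampling responsibilities above concrete, here is a minimal, illustrative sketch of the kind of tooling this role might build. The function names, metrics, and thresholds are hypothetical, not Evolent's actual systems:

```python
import random
import statistics

def detect_drift(recent_scores, baseline_mean, threshold=0.1):
    """Flag drift when the mean of recent quality scores falls
    more than `threshold` below the established baseline."""
    recent_mean = statistics.mean(recent_scores)
    return (baseline_mean - recent_mean) > threshold

def sample_for_audit(records, rate=0.05, seed=None):
    """Randomly select a fraction of LLM responses for clinical review.
    A fixed seed makes the audit sample reproducible."""
    rng = random.Random(seed)
    return [r for r in records if rng.random() < rate]
```

In production, a check like `detect_drift` would typically feed an alerting pipeline (e.g., via Logfire, which the posting names), and `sample_for_audit` would route the selected responses into a clinical review queue.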
Requirements:
- Bachelor's or master's degree in computer science, data science, or a related field
- 2+ years of experience with Python development and at least one production LLM implementation
- Strong proficiency in SQL for complex log analysis and metrics generation
- Demonstrated experience with LLM APIs and frameworks (e.g., PydanticAI, LangChain, or similar)
- Experience with monitoring tools and practices for AI systems, including performance metrics, drift detection, and alerting
- Understanding of LLM behavior, prompt engineering, and common failure modes in production
- Experience building evaluation or testing frameworks for AI/ML systems
- Strong communication skills for cross-functional collaboration
- Experience with healthcare AI applications and compliance requirements
- Familiarity with multiple LLM providers (OpenAI, Anthropic, Google, Azure)
- Knowledge of Pydantic ecosystem including PydanticAI and Logfire
- Understanding of LLM evaluation metrics and methodologies
- Experience building tools for non-technical users
- Basic knowledge of containerization (Docker) for local testing and development
- Experience with cloud environments (AWS, Azure) as a user
- Understanding of API rate limiting, quota management, and cost optimization strategies
- Knowledge of CI/CD concepts for ML model deployments
- Experience with regulatory compliance and audit processes
- Excellent documentation skills and attention to detail
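As an illustration of the rate-limiting and quota-management knowledge the requirements call for, the sketch below shows jittered exponential backoff, a common pattern for handling LLM provider rate limits. All names and defaults are hypothetical:

```python
import random
import time

def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=30.0,
                      retryable=(TimeoutError,), sleep=time.sleep):
    """Call `fn`, retrying on retryable errors with jittered
    exponential backoff; re-raise after the final attempt."""
    for attempt in range(max_retries):
        try:
            return fn()
        except retryable:
            if attempt == max_retries - 1:
                raise
            # Double the delay each attempt, cap it, and add jitter
            # so many clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * random.uniform(0.5, 1.0))
```

In practice the `retryable` tuple would be the provider SDK's rate-limit exception (e.g., an HTTP 429 error type), and the `sleep` parameter is injected here so the behavior can be unit-tested without real waits.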