The University of Texas MD Anderson Cancer Center is seeking a Senior Machine Learning Operations Engineer to support enterprise-wide artificial intelligence initiatives within Data Impact & Governance. This role involves building, deploying, and sustaining production-quality machine learning systems while collaborating with various stakeholders to ensure AI solutions are scalable and responsible.

Responsibilities:

Oversee end-to-end AI model lifecycles including training, evaluation, deployment, monitoring, and maintenance of production-quality machine learning models
Design and implement CI/CD pipelines for model training, deployment, monitoring, and retraining with a focus on security, scalability, reliability, reproducibility, and performance
Implement rigorous testing, versioning, and documentation practices to support reproducibility, risk mitigation, and measurable impact
Maintain comprehensive experiment tracking, data lineage, model lineage, and model scorecards
Design fallback, rollback, and decommissioning strategies to ensure operational continuity of AI solutions
Promote responsible AI practices by minimizing bias, enhancing fairness, and maximizing transparency in machine learning models
Ensure AI lifecycle management aligns with institutional standards and best practices
Support assessment, validation, and onboarding of external machine learning models and AI-driven products to minimize organizational risk and maximize value
Develop and maintain scalable data pipelines, feature stores, and artifact management systems
Deploy and operate ML workloads across on-premises and cloud environments, including Azure, AWS, or GCP
Utilize containerization and orchestration technologies such as Docker, Kubernetes, and DAG-based tools
Apply DevOps and MLOps tools including Azure DevOps, GitHub Actions, and version control systems
Collaborate with stakeholders to gather requirements, translate AI concepts into understandable terms, and incorporate feedback
Partner with data scientists, ML engineers, and software engineers to integrate models into enterprise systems
Deliver training and knowledge sharing to enhance AI understanding and adoption across the organization
Report project progress, impact, risks, and recommendations to leadership
Stay current with emerging technology trends in AI, MLOps, and healthcare analytics
Contribute to internal and external technical communities
Foster a culture of continuous improvement, innovation, and learning across teams
Perform other duties as assigned

Requirements:

Bachelor's degree in Computer Science, Software Engineering, Data Science, Physics, Math & Statistics, or another related engineering discipline
Five years of experience in machine learning engineering, data science, data engineering, and/or software engineering
With a Master's degree, three years of experience required
With a PhD, one year of experience required

Preferred:

Master's-level degree
Experience developing MLOps pipelines for computer vision AI models
Hands-on experience developing custom machine learning algorithms from scratch (e.g., in NumPy or PyTorch)
Designed and implemented a shared machine learning service used across multiple teams or production projects
Led the development of systems that automate the deployment and maintenance of multiple machine learning models into user-facing products
Five years of industry experience in data science, with at least three of those years as a Senior Machine Learning Engineer
