UT MD Anderson, a nationally recognized cancer center, is seeking a Machine Learning Engineer – Platforms to support the development and scalability of its enterprise AI/ML platform. The role involves hands-on engineering focused on MLOps, platform integration, and collaboration with data scientists and IT teams to enhance AI solutions in healthcare.
Responsibilities:
- Support development, administration, and maintenance of the enterprise AI/ML platform (Dataiku, Kubernetes, Azure), ensuring scalability, reliability, and smooth integration with institutional systems
- Orchestrate training, deployment, and inference pipelines within Dataiku targeting Azure and on‑premises Kubernetes clusters
- Develop and maintain MLOps workflows for reproducibility, version control, governance, and model lifecycle management
- Manage and optimize containerized environments using Docker and Kubernetes to support data science workloads
- Provide platform support for data scientists and ML engineers, troubleshooting environment, pipeline, and dependency issues
- Monitor platform performance, cost, security, and compliance, ensuring alignment with enterprise and regulatory standards
- Build and support scalable pipelines in Dataiku, Kubernetes, and Azure, including feature engineering, model tracking, and validation workflows
- Debug, test, and resolve complex platform or pipeline issues using strong analytical and problem‑solving skills
- Assist with healthcare data integration using standards such as HL7, FHIR, or DICOM when required for model development
- Share platform knowledge, best practices, and methodologies through training, documentation, and cross‑team collaboration
- Support analytics and automation workflows by enabling access to data, reviewing project requests, and assisting with interpretation
- Communicate platform updates, risks, performance, and issue resolutions clearly during meetings and collaborative sessions
- Work effectively with leaders, technical peers, and end users, ensuring strong communication across both technical and non‑technical stakeholders
- Perform additional tasks as assigned to support the AI/ML platform, MLOps practices, and enterprise data science initiatives
Requirements:
- Bachelor's degree in Computer Science, Software Engineering, Data Science, Physics, Math & Statistics, or another related engineering discipline
- 3 years of experience in machine learning engineering, data science, data engineering, and/or software engineering
- With a Master's degree, 1 year of experience
- Master's degree in Computer Science, Software Engineering, Data Science, Physics, Math & Statistics, or another related engineering discipline
- Healthcare experience needed
- Experience with MLOps platforms and/or cloud AI certifications
- Strong proficiency in CI/CD and automation of the AI lifecycle
- Experience working on healthcare-focused machine learning projects
- Experience with Azure and/or Kubernetes
- Proficiency in services such as Azure Kubernetes Service (AKS) and Azure ML (or similar)