IMO Health combines software development, artificial intelligence, and clinical expertise to create AI-driven solutions. The company is seeking a Staff AI / MLOps Engineer to own the end-to-end machine learning lifecycle for production AI systems, with a focus on operational excellence and architectural rigor.

Responsibilities:

Own the full ML lifecycle, including data ingestion, training, validation, deployment, monitoring, retraining, and retirement
Transition AI/ML prototypes into scalable, production-ready systems with CI/CD pipelines, automation, and observability
Lead system design and architecture discussions, providing guidance on ML systems, MLOps, and AI infrastructure
Develop and maintain AI-driven applications and inference services, optimizing for performance, scalability, reliability, and cost
Integrate LLMs, generative AI, and NLP solutions into IMO Health products, focusing on unstructured clinical data
Implement monitoring, alerting, logging, and dashboards to ensure model quality, detect drift, and maintain operational SLAs
Build, maintain, and optimize CI/CD pipelines, automation scripts, and Infrastructure-as-Code for production ML systems
Apply containerization (Docker, Kubernetes) and cloud infrastructure best practices to manage production environments
Mentor and guide engineers, enforce technical standards, and drive down technical debt
Conduct root cause analysis of production defects and implement durable fixes
Advocate for non-functional requirements (availability, scalability, reliability, maintainability) and design systems accordingly
Collaborate cross-functionally with Product, Data Science, Architecture, and Engineering teams to align AI solutions with business goals

Requirements:

8+ years of professional experience in software engineering, AI/ML engineering, or related roles, building and operating production-grade systems
Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field (or equivalent experience)
Strong foundation in computer science fundamentals (data structures, algorithms, design patterns, operating systems, networking)
Expert-level coding skills in Python or Java, with a strong emphasis on production-quality software engineering practices
Hands-on experience owning ML systems in production, including deployment, monitoring, retraining, and optimization
Experience designing and operating CI/CD pipelines, automation, and observability for ML systems
Deep experience with cloud platforms (AWS or Azure), containerization, and Infrastructure-as-Code
Experience with MLOps tools and workflows (e.g., MLflow, SageMaker, Kubeflow)
Experience integrating and deploying LLMs, generative AI, and agentic systems in production environments
Working knowledge of NLP concepts (tokenization, embeddings, classification, sequence modeling); healthcare exposure is a plus
Experience with Elasticsearch and vector databases for embedding-based search and retrieval
Proven ability to translate business needs into scalable, reliable technical solutions, balancing technical debt and delivery velocity
Strong system design skills for high-performance, distributed, and scalable systems
Excellent communication and collaboration skills across cross-functional, distributed teams
Self-starter who can operate autonomously and own complex systems end to end
Experience with clinical or healthcare AI applications
Familiarity with Hugging Face, PyTorch, TensorFlow, or other modern ML frameworks
AWS Associate-level certification (Machine Learning Engineer or Solutions Architect)

Staff AI / MLOps Engineer - Clinical AI
