Duetto is a high-growth global technology company transforming the hotel industry with innovative analytical solutions. The company is seeking a Senior Machine Learning Platform Engineer to build and scale its machine learning infrastructure and workflows, ensuring the delivery of robust ML models for hotel customers.
Responsibilities:
- Develop, maintain, and scale machine learning pipelines for training, validation, and batch or real-time inference across thousands of hotel-specific models
- Build reusable components to support model training, evaluation, deployment, and monitoring within a Kubernetes- and AWS-based environment
- Partner with data scientists to translate notebooks and prototypes into production-grade, versioned training workflows
- Implement and maintain feature engineering workflows, integrating with custom feature pipelines and supporting services
- Collaborate with platform and DevOps teams to manage infrastructure-as-code (Terraform), automate deployment (CI/CD), and ensure reliability and security
- Integrate model monitoring for performance metrics, drift detection, and alerting (using tools like Prometheus, CloudWatch, or Grafana)
- Improve retraining, rollback, and model versioning strategies across different deployment contexts
- Support experimentation infrastructure and A/B testing integrations for ML-based products
Requirements:
- 3+ years of experience in ML engineering or a similar role building and deploying machine learning models in production
- Strong experience with AWS services used for ML workloads (SageMaker, Lambda, EMR, ECR) for training, serving, and orchestrating model workflows
- Hands-on experience with Kubernetes (e.g., EKS) for container orchestration and job execution at scale
- Strong proficiency in Python, with exposure to ML/DL libraries such as TensorFlow, PyTorch, and scikit-learn
- Experience working with feature stores, data pipelines, and model versioning tools (e.g., SageMaker Feature Store, Feast, MLflow)
- Familiarity with infrastructure-as-code and deployment tools such as Terraform, GitHub Actions, or similar CI/CD systems
- Experience with logging and monitoring stacks such as Prometheus, Grafana, CloudWatch, or similar
- Experience working in cross-functional teams with data scientists and DevOps engineers to bring models from research to production
- Strong communication skills and ability to operate effectively in a fast-paced, ambiguous environment with shifting priorities