Triumph is focused on creating a seamless freight transaction network and is seeking a Machine Learning Engineer to join its team. In this role, you'll be responsible for deploying and optimizing ML models, building and maintaining infrastructure, and collaborating across teams to ensure the high performance and reliability of ML applications.
Responsibilities:
- Deploy and productionize ML models developed by the Data Science teams
- Build and maintain production ML infrastructure with automated pipelines, monitoring, alerting, and CI/CD practices to ensure high availability and reliability
- Develop and integrate complex business logic in Python to embed models into production systems and workflows
- Scale and optimize model performance and serving infrastructure (latency, throughput, caching, quantization) to meet strict production SLAs
- Construct and optimize data pipelines that feed production ML systems
- Collaborate with US stakeholders to translate production metrics and system behavior into business insights when needed
Requirements:
- Strong communication skills with the ability to explain the 'why' behind technical decisions to diverse audiences
- 4+ years of software engineering experience building and maintaining production ML systems
- Proficiency with ML frameworks (scikit-learn, LightGBM, XGBoost)
- Strong production systems knowledge: Docker, Kubernetes, AWS (S3, ECS/EKS, SageMaker)
- Experience building and optimizing data pipelines at scale using ETL tools and workflow orchestration platforms (Prefect, Airflow)
- Strong testing and CI/CD practices (automated testing, deployment pipelines)
- Experience monitoring and maintaining production ML systems with alerting, logging, and retraining workflows
- Ability to work core hours on Eastern Standard Time (EST) to facilitate a 3–4 hour daily overlap with our EU-based engineering team
- Leadership experience in engineering teams or technical consulting
- Contributions to open-source ML projects
- Experience optimizing and monitoring model behavior using advanced hyperparameter tuning and anomaly detection techniques
- Previous work in the logistics or supply chain domain
- Published work or presentations on ML engineering topics