Twilio is shaping the future of communications through innovative solutions for businesses. As a Machine Learning Engineer, you will drive the development of cutting-edge products, collaborating with cross-functional teams to build scalable ML systems that enhance customer experiences.
Responsibilities:
- Partner with product, UX, and technical stakeholders to analyze business problems, clarify requirements, define scope, and translate them into measurable ML problem statements
- Design, implement, and maintain scalable, enterprise-grade ML solutions in production
- Build reproducible ML workflows for data preparation, training, evaluation, and inference using modern orchestration and MLOps tooling
- Implement monitoring and evaluation frameworks to continuously improve data quality, model performance, latency, and cost through feedback loops
- Partner cross-functionally with Product, Data Science/ML, Engineering, and Security to deliver resilient, scalable, and compliant ML-powered services
- Demonstrate end-to-end systems understanding and articulate the 'why' behind model and system design choices
- Own operational excellence: SLAs, on-call, incident response, customer feedback triage, and blameless post-mortems
- Drive engineering excellence via AI-assisted SDLC, code reviews, automated testing, MLOps best practices, knowledge-sharing, and mentoring
- Actively adopt AI-assisted practices to improve implementation and collaboration efficiency
Requirements:
- Strong foundation in ML/AI (statistics, probability, optimization) with the ability to apply these concepts to real-world problems
- 5+ years of experience building, deploying, and operating data and ML systems in production
- Proficiency in Python, Java, and SQL, plus strong software engineering fundamentals (system design, testing, version control, code reviews)
- Hands-on experience with workflow orchestration and data pipelines (e.g., Airflow, Kubeflow) and cloud data platforms/storage (e.g., SageMaker Feature Store, Snowflake, DynamoDB, OpenSearch)
- Experience with the ML lifecycle and MLOps tooling (e.g., MLflow, Metaflow, SageMaker; LLM/agent frameworks such as LangChain/LangGraph; model evaluation/observability tools such as Galileo or similar)
- Working knowledge of containerization and cloud infrastructure, including Docker and Kubernetes, GitOps/CI/CD tools (e.g., Argo CD), and at least one major cloud platform (AWS, GCP, or Azure)
- Understanding of data modeling and scalable systems, including distributed computing and streaming frameworks (e.g., Spark/EMR, Flink, Kafka Streams); familiarity with GPU-based development is a plus
- Demonstrated ability to ramp up quickly and operate effectively in new application/business domains
- Strong written and verbal communication skills: able to document and present designs and decisions, and comfortable giving/receiving feedback in an Agile environment
- Familiarity with ML problem areas and techniques, including recommendation systems (e.g., graph-based approaches, two-tower models), time-series modeling (classical and deep learning), representation learning (e.g., embeddings), anomaly detection, and causal inference
- Practical experience with LLMs and generative AI workflows, including foundation model fine-tuning, RAG, and vector databases
- Evidence of technical leadership and impact, such as contributions to open-source data/ML projects or published technical presentations, blog posts, papers, or research
- Domain experience in communications, marketing automation, or customer engagement analytics is a plus
- Familiarity with AI-assisted development tools (e.g., Claude, GitHub Copilot/Codex, Cursor)
- Advanced degree (M.S. or Ph.D.) in a relevant field preferred