Yahoo is a leading technology company connecting brands and partners with a vast audience. As a Senior ML Engineer, you will develop and optimize machine learning solutions to transform user data into actionable insights, collaborating with cross-functional teams to deliver personalized experiences and enhance audience engagement.
Responsibilities:
- Develop and optimize end-to-end ML solutions for audience segmentation, predictive modeling, and behavioral enrichment at a scale of 2.5B+ profiles
- Build reliable production pipelines for training, evaluating, and deploying ML models using GCP infrastructure (Vertex AI, Dataflow, Composer)
- Design robust feature engineering pipelines using large-scale data processing frameworks (Spark, Beam, BigQuery)
- Implement comprehensive monitoring solutions tracking model performance, data drift, prediction quality, and business impact metrics
- Tune, validate, and optimize ML models for accuracy, efficiency, and scalability while managing computational costs
- Collaborate with Data Science teams to productionize research models and translate prototypes into scalable production systems
- Partner with Product teams to understand business requirements and deliver ML capabilities for audience intelligence
- Apply ML engineering best practices including version control (Git), automated testing, CI/CD workflows, and model versioning
- Create comprehensive documentation for ML systems, feature pipelines, model artifacts, and operational runbooks
- Improve model efficiency, inference latency, and resource utilization for cost-effective production serving
- Troubleshoot data pipeline failures, model serving issues, and data quality problems in production environments
- Participate in technical discussions, code reviews, and knowledge sharing across teams
Requirements:
- Bachelor's or Master's degree in Computer Science, Data Science, Statistics, Machine Learning, or a related technical field
- 5+ years of software engineering experience building production systems
- 3+ years in ML engineering, data science, or applied machine learning roles
- 2+ years implementing and deploying ML models to production environments at scale
- 2+ years of hands-on experience with GCP (BigQuery, Dataproc, Composer, Dataflow, Vertex AI) or AWS equivalents
- Strong Python and/or Java programming skills for production ML systems
- Proficiency with ML frameworks (TensorFlow, PyTorch, scikit-learn, XGBoost) for building and deploying models
- Hands-on experience with data processing frameworks: Apache Spark, Apache Beam, or equivalent distributed computing tools
- Solid ML fundamentals: supervised/unsupervised learning, feature engineering, model evaluation, hyperparameter tuning
- SQL proficiency for large-scale data manipulation and feature extraction
- Understanding of MLOps practices: model versioning, A/B testing, monitoring, feature stores, model serving
- Demonstrated ability to translate business requirements into technical ML solutions with measurable impact
- Strong problem-solving skills and analytical thinking in complex data environments
- Excellent collaboration skills with cross-functional teams, including product, data science, and engineering
- Team-level impact, with the ability to influence technical decisions within the immediate team
- Understanding of data privacy compliance (GDPR, CCPA) in ML systems
Preferred Qualifications:
- Experience with audience segmentation, propensity modeling, recommendation systems, or user behavior prediction
- Knowledge of privacy-preserving ML techniques and compliance requirements (GDPR, CCPA, differential privacy)
- Familiarity with MLOps tools: MLflow, Kubeflow, Vertex AI Pipelines, Weights & Biases
- Prior experience in adtech, marketing technology, or consumer analytics platforms
- Understanding of A/B testing methodologies, experimentation frameworks, and causal inference
- Experience with online learning, real-time model serving, or feature streaming
- Contributions to ML open-source projects, technical publications, or conference presentations
- Self-driven and detail-oriented, with strong problem-solving abilities in fast-paced environments