Pinterest is a platform where millions of people find creative ideas and plan for memories. The company is seeking a Staff Machine Learning Engineer to lead modeling strategy and develop production models for content understanding, focusing on high-quality semantic signals that improve product relevance and integrity.
Responsibilities:
- Lead modeling strategy for content understanding (vision, NLP, multimodal), including architecture selection, training approach, and evaluation methodology
- Design and ship production models that generate content signals such as embeddings and classifications used across multiple product surfaces
- Own the full ML lifecycle: data/labeling strategy (human labels + weak supervision), training pipelines, offline evaluation, online experimentation, deployment, and monitoring/retraining
- Partner with infra/platform teams to ensure scalable, reliable training/serving (latency, cost, observability, rollout safety)
- Collaborate with signal-consuming teams (ranking, retrieval, integrity, ads) to define signal contracts, adoption patterns, and success metrics
- Provide technical leadership through design reviews, mentoring, and raising the quality bar for modeling and ML engineering practices
Requirements:
- M.S. or PhD in Computer Science, Statistics, or a related field
- Significant industry experience building software and ML pipelines/systems, including technical leadership (project/tech lead or equivalent)
- Strong proficiency in Python and at least one ML stack such as PyTorch or TensorFlow, plus solid software engineering fundamentals
- Proven experience training and deploying ML models to production, including model versioning, rollouts, monitoring, and retraining strategies
- Deep hands-on experience in content understanding domains, such as computer vision (classification, detection, representation learning); NLP (text classification, entity/topic modeling); and multimodal/embedding models (e.g., transformer-based representations)
- Experience working with large-scale datasets and distributed compute (e.g., Spark-like ecosystems, distributed training, GPU environments)
- Strong applied skills in evaluation and experimentation: defining metrics, offline/online alignment, A/B testing, debugging regressions, and model quality analysis
- Demonstrated ability to influence across teams and drive ambiguous problem areas to measurable outcomes