Reddit is a community of communities built on shared interests and trust, and is home to authentic conversations on the internet. The Senior Machine Learning Engineer will design and build production ML systems that enhance user experience and optimize advertising systems across the platform.
Responsibilities:
- Design, build, and deploy production-grade machine learning models and systems at scale
- Own the full ML lifecycle, from problem definition and feature engineering to training, evaluation, deployment, and monitoring
- Build scalable data and model pipelines with strong reliability, observability, and automated retraining
- Work with large-scale datasets to improve ranking, recommendations, search relevance, prediction, content/user understanding, and optimization systems
- Partner cross-functionally with Product, Data Science, Infrastructure, and Engineering teams to translate complex problems into ML solutions
- Improve system performance across latency, throughput, and model quality metrics
- Research and apply state-of-the-art machine learning and AI techniques, including deep learning, graph- and transformer-based models, and LLM evaluation/alignment
- Contribute to technical strategy, architecture, and long-term ML roadmap
Requirements:
- 3-5+ years of experience building, deploying, and operating machine learning systems in production
- Strong programming skills in Python, Java, Go, or similar languages, with solid software engineering fundamentals
- ML fundamentals: a strong grasp of algorithms, from classic statistical learning (XGBoost, random forests, regression) to deep learning architectures (Transformers, CNNs, GNNs)
- Hands-on experience with modern ML frameworks (e.g., PyTorch, TensorFlow)
- Experience designing scalable ML pipelines, data processing systems, and model serving infrastructure
- Ability to work cross-functionally and translate ambiguous product or business problems into technical solutions
- Experience improving measurable metrics through applied machine learning
- Experience with recommender systems, search/ranking systems, advertising/auction systems, large-scale representation learning, or multimodal embedding systems
- Familiarity with distributed systems and large-scale data processing (Spark, Kafka, Ray, Airflow, BigQuery, Redis, etc.)
- Experience working with real-time systems and low-latency production environments
- Background in feature engineering, model optimization, and production monitoring
- Experience with LLM/GenAI techniques, including but not limited to LLM evaluation, alignment, fine-tuning, knowledge distillation, RAG and agentic systems, and productionizing LLM-powered products at scale
- Advanced degree in Computer Science, Machine Learning, or related quantitative field