Reddit is a community of communities, built on shared interests and trust, with millions of daily active users. The company is seeking a Machine Learning Engineer to design and build production ML systems that power core experiences across the platform, including personalized recommendations and intelligent advertising systems.
Responsibilities:
- Design, build, and deploy production-grade machine learning models and systems at scale
- Own the full ML lifecycle: from problem definition and feature engineering to training, evaluation, deployment, and monitoring
- Build scalable data and model pipelines with strong reliability, observability, and automated retraining
- Work with large-scale datasets to improve ranking, recommendations, search relevance, prediction, content/user understanding, and optimization systems
- Partner cross-functionally with Product, Data Science, Infrastructure, and Engineering teams to translate complex problems into ML solutions
- Improve system performance across latency, throughput, and model quality metrics
- Research and apply state-of-the-art machine learning and AI techniques, including deep learning, graph- and transformer-based models, and LLM evaluation/alignment
- Contribute to technical strategy, architecture, and long-term ML roadmap
Requirements:
- 3-5+ years of experience building, deploying, and operating machine learning systems in production
- Strong programming skills in Python, Java, Go, or similar languages, with solid software engineering fundamentals
- Strong ML fundamentals: a solid grasp of algorithms ranging from classic statistical learning (XGBoost, Random Forests, regression) to deep learning architectures (Transformers, CNNs, GNNs)
- Hands-on experience with modern ML frameworks (e.g., PyTorch, TensorFlow)
- Experience designing scalable ML pipelines, data processing systems, and model serving infrastructure
- Ability to work cross-functionally and translate ambiguous product or business problems into technical solutions
- Experience improving measurable metrics through applied machine learning
- Experience with recommender systems, search/ranking systems, advertising/auction systems, large-scale representation learning, or multimodal embedding systems
- Familiarity with distributed systems and large-scale data processing (Spark, Kafka, Ray, Airflow, BigQuery, Redis, etc.)
- Experience working with real-time systems and low-latency production environments
- Background in feature engineering, model optimization, and production monitoring
- Experience with LLM/GenAI techniques, including but not limited to LLM evaluation, alignment, fine-tuning, knowledge distillation, and RAG/agentic systems, and with productionizing LLM-powered products at scale
- Advanced degree in Computer Science, Machine Learning, or related quantitative field