Reddit, a community of communities, is seeking a Staff Research Engineer for Pre-training Science to lead the development of foundational Large Language Models tailored to Reddit's unique culture and language. The role involves defining the Continual Pre-Training strategy and conducting research on multimodality and data curricula to improve model performance.
Responsibilities:
- Architect and validate rigorous Continual Pre-Training (CPT) frameworks, focusing on domain adaptation techniques that effectively transfer Reddit’s knowledge into licensed frontier models
- Design the "Science of Multimodality": Lead research into fusing vision and language encoders to process Reddit’s rich media (images, video) alongside conversational text threads
- Formulate data curriculum strategies: scientifically determining the optimal ratio of "Reddit data" vs. "General data" to maximize community understanding while maintaining safety and reasoning capabilities
- Conduct deep-dive research into Scaling Laws for graph-structured data: investigating how Reddit’s tree-structured conversations affect model convergence compared to flat text
- Design and scale continuous evaluation pipelines (the "Reddit Gym") that monitor model reasoning and safety capabilities in real time, enabling dynamic adjustments to training recipes
- Drive high-stakes architectural decisions regarding compute allocation, distributed training strategies (3D parallelism), and checkpointing mechanisms on AWS Trainium/Nova clusters
- Serve as a force multiplier for the engineering team by setting coding standards, conducting high-level design reviews, and mentoring senior engineers on distributed systems and ML fundamentals
Requirements:
- 7+ years of experience in Machine Learning engineering or research, with a specific focus on LLM Pre-training, Domain Adaptation, or Transfer Learning
- Expert-level proficiency in Python and deep learning frameworks (PyTorch or JAX), with a track record of debugging complex training instabilities at scale
- Deep theoretical understanding of Transformer architectures and Pre-training dynamics—specifically regarding Catastrophic Forgetting and Knowledge Injection
- Experience with Multimodal models (VLM): understanding how to align image/video encoders (e.g., CLIP, SigLIP) with language decoders
- Experience implementing continuous integration/evaluation systems for ML models, measuring generalization and reasoning performance
- Demonstrated ability to communicate complex technical concepts (like loss spikes or convergence issues) to leadership and coordinate efforts across Infrastructure and Data teams
- Published research or open-source contributions in Continual Learning, Curriculum Learning, or Efficient Fine-Tuning (LoRA/PEFT)
- Experience with Graph Neural Networks (GNNs) or processing tree-structured data
- Proficiency in low-level optimization (CUDA, Triton) or distributed training frameworks (Megatron-LM, DeepSpeed, FSDP)
- Familiarity with safety-alignment techniques (RLHF/DPO) and an understanding of how pre-training objectives affect downstream safety