Reflection AI is on a mission to build open superintelligence and make it accessible to all. The Alignment Lead will drive the alignment stack, lead research efforts on reward models, curate training data, and optimize RL pipelines to enhance model performance.
Responsibilities:
- Drive the entire alignment stack, spanning instruction tuning, RLHF, and RLAIF, to push the model toward high factual accuracy and robust instruction following
- Lead research efforts to design next-generation reward models and optimization objectives that significantly improve performance on human-preference evaluations
- Curate high-quality training data and design synthetic data pipelines that close gaps in complex reasoning and model behavior
- Optimize large-scale RL pipelines for stability and efficiency, enabling rapid iteration on model improvements
- Collaborate closely with pre-training and evaluation teams to create tight feedback loops that translate alignment research into generalizable model gains