Reflection AI is on a mission to build open superintelligence and make it accessible to all. The role involves conducting comparative analysis, building evaluation systems, and collaborating with teams to improve model capabilities.

Responsibilities:

Conduct critical comparative analysis to advance our understanding of model capabilities
Build and refine evaluation systems and processes that create tight feedback loops between data, evals, and model behavior
Develop generalizable evaluation frameworks that capture what matters for reasoning, alignment, and usefulness
Collaborate closely with pre-training, post-training, and applied teams to translate insights into model improvements
Push the boundaries of what’s measurable, from synthetic evals to human feedback and real-world interaction data

Member of Technical Staff - Evaluations
