Cohere is dedicated to scaling intelligence to serve humanity by training and deploying frontier models for AI systems. The Senior Research Scientist in Model Evaluation will create next-generation evaluation methods and infrastructure to measure LLM progress, while working on cross-functional teams to enhance model evaluation techniques.

Responsibilities:

Create ambitious new evaluation benchmarks that push the limits of what our models can accomplish
Work on highly cross-functional teams to translate model feedback into trustworthy, repeatable evaluations
Conduct research to advance the state of the art in LLM evaluation methods, including training LLM judges, refining LLM-based data synthesis pipelines, and improving evaluation efficiency
Build scalable and reusable tools for digging into model performance

Senior Research Scientist, Model Evaluation

About this role

Responsibilities: