Cohere is on a mission to scale intelligence to serve humanity by training and deploying frontier models for AI systems. The Evaluation Frontend Software Engineer will focus on building tools for visualizing and analyzing model evaluations, collaborating with cross-functional teams to enhance the development of large language models.
Responsibilities:
- Design tools and visualizations that enable researchers and engineers to compare and analyze hundreds of model evaluations. This includes both data visualization tools and statistical tools for extracting signal from noisy evaluation results
- Develop an understanding of the relative merits and limitations of each of our model evaluations, and suggest new facets of model evaluation
Requirements:
- Extremely strong software engineering skills
- Strong statistical skills and experience evaluating scientific experiments related to data collection and model performance
- Prior experience building front-end visualization systems and dashboards
- Familiarity with evaluating ML systems
- Proficiency in programming languages such as Python, and in ML frameworks (e.g., PyTorch, TensorFlow, JAX)
- Excellent communication skills to collaborate effectively with cross-functional teams and present findings
- One or more papers at top-tier venues (such as NeurIPS, ICML, ICLR, AISTATS, MLSys, JMLR, AAAI, Nature, COLING, ACL, EMNLP)