Cohere is on a mission to scale intelligence to serve humanity by training and deploying frontier models for AI systems. The Evaluation Frontend Software Engineer will focus on building tools for visualizing and analyzing model evaluations, collaborating with cross-functional teams to enhance the development of large language models.
Responsibilities:
- Design tools and visualizations that enable researchers and engineers to compare and analyze hundreds of model evaluations. This includes both data visualization tools and statistical tools for extracting signal from noisy evaluation results
- Develop an understanding of the relative merits and limitations of each of our model evaluations, and suggest new facets of model evaluation
Requirements:
- Extremely strong software engineering skills
- Strong statistical skills and experience evaluating scientific experiments related to data collection and model performance
- Prior experience building front-end visualization systems and dashboards
- Familiarity with evaluating ML systems
- Proficiency in programming languages such as Python, and in ML frameworks (e.g., PyTorch, TensorFlow, JAX)
- Excellent communication skills to collaborate effectively with cross-functional teams and present findings
- One or more papers at top-tier venues (such as NeurIPS, ICML, ICLR, AISTATS, MLSys, JMLR, AAAI, Nature, COLING, ACL, EMNLP)