Role Overview
- Proactively identify, prioritize, and curate relevant public and client-driven benchmarks across our target use cases and markets.
- Evaluate candidate benchmarks for clarity, data quality, evaluation methodology, and fit with our model roadmap.
- Run benchmarks with baseline models to validate setup, uncover edge cases, and de‑risk R&D runs.
- Hand off “benchmark-ready” packages to R&D (specs, data, evaluation scripts, expected metrics, constraints).
- Maintain a shared vocabulary and documentation around benchmarks, datasets, and evaluation formats that GTM and R&D can both use.
- Track and organize benchmark results, model leaderboards, and “what good looks like” for different customers and scenarios.
- Contribute to demos and public‑facing proof points based on benchmark outcomes.
Requirements
- You are expected to meet at least one of the following criteria:
  - You were an ICPC World Finalist, or an IOI, IMO, IOAI, or IPhO medalist in high school.
  - You have published a research paper at an A-rated or A*-rated venue (according to ICORE).
  - You have completed coding projects, ideally with a GitHub repository showcasing previous work.
  - You were an intern at a leading machine learning research center (e.g., Google Brain / DeepMind, Apple, Meta, Anthropic, Nvidia, MILA).
  - You can get a warm recommendation from a faculty member at your university.
- You have experience with ML/LLM evaluation, data science, or technical product roles, ideally around benchmarks or experimentation.
- You are comfortable reading papers, leaderboards, and GitHub repos, and turning them into clear, repeatable benchmark specs.
- You can talk comfortably with both engineers and customers, and translate between technical detail and business value.
- You care about high-quality data, reproducible experiments, and crisp documentation.
- You are respectful of others.
- You are fluent in English.
Benefits
- Join an intellectually stimulating work environment.
- During your internship, you will collaborate on a cutting-edge research project.
- Be a pioneer: you get to work on a new type of "Live AI" challenge.
- Be part of an early-stage AI startup that believes in impactful research and foundational changes.