Apollo Research is focused on assessing the risks posed by scheming AIs. The Research Scientist/Engineer will run evaluation campaigns on advanced AI systems, analyze model behaviors, and develop new evaluations to address frontier risks.
Responsibilities:
- Run pre-deployment evaluation campaigns on the most capable AI systems in the world
- Investigate AI cognition in depth by analyzing model behaviors
- Build new evaluations for frontier risks, from designing novel test environments to scaling them across hundreds of distinct scenarios
- Work directly with frontier AI developers
- Automate and improve the evaluation pipeline