Polymath is an applied research lab focused on advancing long-horizon agent capabilities through reinforcement learning. The lab is seeking talented researchers currently enrolled in MS or PhD programs to collaborate on a research project that develops benchmarks and trains autonomous agents for complex tasks.

Responsibilities:

Identifying failure modes in frontier models
Developing rigorous benchmarks that evaluate how well frontier agents perform on complex, realistic tasks requiring long-horizon reasoning and tool use in dynamic environments
Training autonomous agents that can reason, plan, and act over extended time horizons

AI Research Resident (Academia)

About this role

Responsibilities: