Judgment Labs builds infrastructure for Agent Behavior Monitoring (ABM). The company is seeking Research Engineers to build AI systems that use agent interaction data to better understand and improve agent performance in high-stakes workflows.
Responsibilities:
- Build systems to aggregate, index, and analyze large-scale agent interaction data to extract meaningful evaluation signals
- Develop agent-based systems for analyzing and evaluating complex, long-running behaviors
- Design and implement post-training and optimization workflows to improve agent behavior
- Build internal tools and infrastructure to support rapid experimentation, analysis, and training