Snorkel AI is a data development company spun out of the Stanford AI Lab, focused on improving the truthfulness and reasoning abilities of advanced AI systems. This role involves designing challenging tasks, stress-testing AI agents, and providing expert feedback to shape the next generation of AI models.
Responsibilities:
- Help shape the next generation of AI models by designing challenging tasks, stress-testing AI agents, and providing expert feedback that steers model behavior toward ideal solutions
- Complete real-world, end-to-end tasks in terminal environments: compiling code, training models, setting up servers, and handling complex system administration autonomously
- Design complex, verifiable terminal tasks spanning Python scripting, automation, data pipelines, software engineering, system administration, and infrastructure — each with a clear, reproducible solution and automated verification criteria
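As a rough illustration of what "a clear, reproducible solution and automated verification criteria" can look like in practice, here is a minimal sketch. The task itself (deduplicate and sort `input.txt` into `output.txt`), the file names, and the functions `run_reference_solution`, `verify`, and `run_task` are all hypothetical and chosen only for this example. They do not describe Snorkel AI's actual task format or tooling.

```python
import tempfile
from pathlib import Path


def run_reference_solution(workdir: Path) -> None:
    """Hypothetical reference solution: write the sorted, unique lines of input.txt to output.txt."""
    lines = (workdir / "input.txt").read_text().splitlines()
    (workdir / "output.txt").write_text("\n".join(sorted(set(lines))) + "\n")


def verify(workdir: Path) -> bool:
    """Automated check: output.txt must exist and contain the sorted, unique input lines."""
    output = workdir / "output.txt"
    if not output.exists():
        return False
    expected = sorted(set((workdir / "input.txt").read_text().splitlines()))
    return output.read_text().splitlines() == expected


def run_task() -> bool:
    """Set up a fresh environment, confirm the check fails first, then passes after the solution runs."""
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        (workdir / "input.txt").write_text("pear\napple\npear\nfig\n")
        if verify(workdir):
            # A verifier that passes before any work is done is not checking anything.
            return False
        run_reference_solution(workdir)
        return verify(workdir)


if __name__ == "__main__":
    print("task verified:", run_task())
```

Running the check both before and after the reference solution is one simple way to confirm that the verifier actually depends on the work being done, rather than passing trivially.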
Requirements:
- Deep fluency in Python and Linux/Unix environments, including shell scripting, containerization and orchestration (Docker/Kubernetes), and package/environment management
- Strong code review instincts — able to identify suboptimal solutions, suggest improvements, and evaluate competing approaches across multiple turns of a conversation
- Proficiency in at least one major programming language (e.g., Python, C++, Java, Go, JavaScript)
- Background in one or more of: DevOps, SRE, Platform Engineering, Backend/Systems Engineering, or Security/Penetration Testing
- Familiarity with agent evaluation frameworks, CI/CD integration, or long-horizon planning in automated pipelines