Snorkel AI is a data development company spun out of the Stanford AI Lab, focused on improving the truthfulness and reasoning abilities of advanced AI systems. They are seeking an Expert Contributor - DevOps Engineer to complete real-world, end-to-end tasks in terminal environments and help shape the next generation of AI models. The role involves designing complex tasks, evaluating AI coding agents, and providing expert feedback to guide model behavior.

Responsibilities:

Design complex, verifiable terminal tasks spanning Python scripting, automation, data pipelines, software engineering, system administration, and infrastructure — each with a clear, reproducible solution and automated verification criteria
Challenge a coding agent with difficult, targeted prompts designed to expose weaknesses and push the boundaries of an existing codebase or repository
Provide multi-turn preference ratings and feedback as you guide the model iteratively toward an ideal solution — evaluating correctness, approach, and engineering judgment at each step
Develop detailed prompts that present realistic, end-to-end engineering scenarios, paired with structured checklists of specific criteria to evaluate correctness, efficiency, and approach
Write grading rubrics and scoring guides modeled on real task completion standards — accounting for edge cases, failure modes, and alternative valid approaches

Requirements:

Demonstrated ability to complete hard, end-to-end terminal tasks autonomously, including compiling code, training models, configuring servers, system administration, security tasks, data science workflows, and debugging systems
Deep fluency in Python and Linux/Unix environments, shell scripting, containerization (Docker/Kubernetes), and package/environment management
Strong code review instincts — able to identify suboptimal solutions, suggest improvements, and evaluate competing approaches across multiple turns of a conversation
Ability to think through multi-step terminal workflows from first principles and clearly articulate the reasoning, tradeoffs, and edge cases involved
Background in one or more of: DevOps, SRE, Platform Engineering, Backend/Systems Engineering, or Security/Penetration Testing
Familiarity with agent evaluation frameworks, CI/CD integration, or long-horizon planning in automated pipelines

Expert Contributor - DevOps Engineer (Task Based)
