San Francisco, California, United States of America
Full Time
$150,000 - $250,000 USD
No Visa Sponsorship
Key skills
AI
About this role
Role Overview
Design and run post-training workflows that improve the behavior, reliability, and usefulness of AI systems
Develop datasets, preference signals, evaluation suites, reward models, fine-tuning workflows, and feedback loops for applied AI use cases
Investigate how different post-training techniques affect system behavior across enterprise workflows and production constraints
Build infrastructure for experimentation, model comparison, regression testing, and behavior analysis
Partner with AI Researchers to explore new post-training methods and with AI Engineers to apply successful techniques in deployed systems
Analyze model outputs, failure modes, human feedback, and production traces to identify opportunities for behavioral improvement
Create repeatable processes for adapting AI systems to customer domains while preserving robustness, transparency, and maintainability
Communicate clearly with internal teams and customer stakeholders about model behavior, evaluation results, limitations, and tradeoffs
Requirements
Experience Improving Model Behavior: You have worked with fine-tuning, preference optimization, reinforcement learning, reward modeling, synthetic data, evals, or related post-training techniques
Strong Programming and Experimentation Skills: You can build training and evaluation pipelines, run controlled experiments, analyze results, and iterate quickly
Research-Oriented Builder: You care about understanding why behavior changes, not just whether a benchmark improves
AI Systems Mindset: You understand that model behavior is shaped by data, prompts, tools, retrieval, evaluators, and deployment context, not by model weights alone
AI-Native Working Style: You use AI tools daily to accelerate coding, analysis, debugging, experimentation, and research exploration
Bias Towards Measurement: You make behavioral improvements concrete through evaluations, comparisons, regression tests, and production-relevant metrics
Comfort with Applied Constraints: You can balance research ambition with practical constraints around cost, latency, reliability, data availability, and customer requirements
Ownership Mentality: You take responsibility for whether post-training work improves real system outcomes, not just offline scores
Benefits
100% covered medical, dental, and vision for employees and dependents
401(k), plus additional perks (e.g., commuter benefits, in‑office lunch)
Access to state‑of‑the‑art models, generous usage of modern AI tools, and real‑world business problems
Ownership of high‑impact projects across top enterprises
A mission‑driven, fast‑moving culture that prizes curiosity, pragmatism, and excellence