San Francisco, California, United States of America
Full Time
5 hours ago
$150,000 - $250,000 USD
No Visa Sponsorship
Key skills
Python, AI, Agentic
About this role
Role Overview
Design, prototype, and implement agentic AI systems that perform reliably across complex enterprise workflows
Build compound AI architectures that combine planning, tool use, retrieval, memory, evaluation, orchestration, and execution
Investigate how agents reason, coordinate, recover from errors, and interact with external systems under real-world constraints
Develop evaluation frameworks that measure agent behavior, task completion, reliability, robustness, and failure modes
Create tools and abstractions that make agent behavior easier to observe, debug, test, and improve
Partner with AI Researchers to explore new agent architectures and with AI Engineers to harden successful approaches for production use
Integrate agents into customer APIs, applications, data platforms, and operational workflows
Communicate clearly with internal teams and customer stakeholders about agent capabilities, limitations, tradeoffs, and risks
Requirements
Experience Building Agentic Systems: You have built AI systems that use models, tools, retrieval, planning, memory, or multi-step execution to complete real tasks
Strong Engineering Fundamentals: You write clean, maintainable Python and are comfortable debugging complex, stateful systems
Systems-Level Reasoning: You think holistically about how prompts, tools, context, evaluators, state, orchestration, and external APIs interact
Research-Oriented Builder: You are curious about why agents succeed or fail, and you can design experiments to test different architectures and behaviors
AI-Native Working Style: You use AI tools daily to write code, debug systems, explore designs, analyze traces, and accelerate experimentation
Bias Toward Showing Over Telling: You prefer working demonstrations, traces, evaluations, and production behavior over abstract descriptions
Comfort in Customer Environments: You can translate ambiguous business workflows into concrete agent designs and explain system behavior clearly to stakeholders
Ownership Mentality: You take responsibility for whether an agentic system performs reliably, safely, and usefully in production
Tech Stack
Python
Benefits
100% covered medical, dental, and vision for employees and dependents
401(k), plus additional perks (e.g., commuter benefits, in‑office lunch)
Access to state‑of‑the‑art models, generous usage of modern AI tools, and real‑world business problems
Ownership of high‑impact projects across top enterprises
A mission‑driven, fast‑moving culture that prizes curiosity, pragmatism, and excellence