OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. They are seeking a Backend Software Engineer to design and build an evals infrastructure that measures the quality of OpenAI’s support automation, collaborating closely with data science and research partners.
Responsibilities:
- Design eval pipelines that are reliable, reproducible, and extendable
- Build the infrastructure for continuous eval monitoring frameworks (regression/drift monitoring, building robust golden datasets) along with feedback loops that ultimately strengthen support automation
- Design, build, and maintain backend services and APIs to support intelligent automation and knowledge systems
- Integrate and structure data across internal platforms, transforming it into formats optimized for use by downstream systems and AI workflows
- Collaborate closely with data, research, and engineering teams to integrate OpenAI models into high-leverage workflows
- Own the full development lifecycle of new backend systems and internal platform capabilities
- Build with scale and maintainability in mind, while rapidly iterating on new ideas
Requirements:
- 4+ years of backend engineering experience at product-driven companies (excluding internships)
- Proficiency in backend technologies. Our tech stack includes Python, FastAPI, and Postgres
- Experience designing and scaling distributed systems, APIs, or data processing pipelines
- Have experience building AI agents or applications, including designing evals and improving performance through prompting or scaffolding
- Are familiar with evaluation methods for LLMs and have worked with patterns like multi-agent workflows, tool use, or long context
- Experience creating production evals and/or measuring performance of ML/LLM models at scale
- A pragmatic mindset. You're comfortable shipping iteratively while building toward a long-term vision