You will design and implement scalable APIs/services for developing AI agents, including orchestration, memory, tool use, and learner interaction
You will implement rigorous evaluation frameworks to improve agent performance (offline and in-product), and build the instrumentation that turns results into actionable improvement cycles
You will build data flows/pipelines that allow real-time personalization and learning insights
You will ensure data quality and accessibility for training and evaluating learning agents
You will improve developer velocity with solid API design, testing practices, and realistic architecture choices
You will ship end-to-end product slices: implement backend capabilities and the corresponding lightweight web UI (and occasionally mobile) to unblock product progress
You will build lightweight internal tools/admin surfaces for evaluation, debugging, content/ops workflows, and product iteration
You will collaborate with product/design to deliver high-quality user experiences while keeping scope and maintainability sane
You will help establish the basics of production readiness: monitoring, dashboards, logging/tracing, and performance/cost hygiene
You will contribute to automated testing and deployment pipelines for rapid, safe iteration (CI/CD, rollbacks, environment hygiene)
You will participate in incident response when needed and help shape runbooks, postmortems, and an on-call model appropriate for our current stage
Requirements
5+ years of production Python experience at scale (or equivalent)
AWS expertise (Lambda, S3, and CloudWatch)
Familiarity with compound systems combining LLMs, ML, and traditional software components
Experience with agentic AI architectures (e.g., CrewAI, LangGraph/LangChain, AutoGen, LlamaIndex, MCP, A2A, ACP, DSPy, or custom frameworks)
Hands-on experience with evaluation methodologies for LLMs and agent-based AI systems
Infrastructure as Code experience (CDK, Terraform, or similar)
Experience with GitHub Actions or similar CI/CD platforms
Experience communicating updates and resolutions to customers and other partners
Comfort working across the stack when needed
Ability to read and write TypeScript and ship small UI features using React/Next.js or similar technologies
Ability to contribute occasional mobile changes in iOS/Swift and Android/Kotlin
Experience building production readiness / SRE foundations (instrumentation/observability basics, safe deployments, performance/cost hygiene)
AI-Augmented Development: Active user of AI development tools (GitHub Copilot, Cursor, Claude Code, Codex), with personal projects and evolving usage over the past 12 months
Demonstrated examples of AI-augmented productivity gains
Enthusiasm for pushing boundaries of AI-assisted engineering
Experience with a relational database such as MySQL, PostgreSQL, or Oracle
Tech Stack
Android
AWS
iOS
JavaScript
Kotlin
MySQL
Next.js
Oracle
Postgres
Python
React
Swift
Terraform
TypeScript
Benefits
High-quality, low-deductible medical insurance
Low to no-cost dental and vision plans
5 weeks of paid time off (plus almost a dozen paid holidays)