Lead the architecture and delivery of end-to-end AI-powered systems, including agents, RAG pipelines, orchestration layers, and reasoning workflows.
Translate product vision into scalable technical systems.
Define contracts, state management strategies, and guardrails for AI-driven workflows.
Own and evolve the API contracts that AI systems interact with, ensuring reliability, idempotency, secure authentication, and rate limiting.
Design schema enforcement and validation layers for AI-generated outputs.
Implement retries, fallback strategies, and failure-mode containment.
Establish evaluation frameworks for benchmarking, regression testing, and drift detection.
Create observability standards for AI systems, including structured logging, telemetry, tracing, and performance monitoring.
Productionize experimental AI capabilities into scalable, secure services.
Establish architectural patterns and standards adopted across teams.
Mentor engineers in AI-native and spec-driven development practices.
Influence engineering culture through clarity, urgency, and execution.
Decompose high-level business outcomes into executable technical systems.
Requirements
4+ years building AI-augmented product capabilities (LLMs, RAG systems, agents, orchestration frameworks).
Event-Driven & Asynchronous Systems: Experience designing decoupled systems using queues and event streams such as Kafka, SQS, or BullMQ, and implementing asynchronous workflows that keep long-running work from blocking user-facing systems.
State Management Strategy: Experience persisting state across sessions, managing context windows efficiently, and handling concurrency and race conditions when multiple agents interact with shared data.
Structured Data Enforcement: Experience enforcing structured outputs using schema validation tools such as Pydantic, Zod, or JSON Schema to ensure AI-generated outputs are reliable and machine-readable.
API Design & Integration: Strong understanding of REST, GraphQL, or RPC interface design, along with authentication, rate limiting, and idempotent API patterns.
Tech Stack
GraphQL
Kafka
Benefits
Unlimited PTO: Trust-based time off so you can recharge and bring your best self to work.
Comprehensive benefits: Medical, dental, and vision coverage to support the health and well-being of you and your family.
Learning & development programs: Access to training, mentorship, and development resources to grow your skills, from HR operations to total rewards strategy.