Archetype AI is developing the world's first AI platform to bring AI into the real world, formed by a high-caliber team from Google. They are seeking a highly motivated backend engineer to design and develop performant, scalable, and resilient inference services, working closely with researchers and product teams to bring AI capabilities into production.
Responsibilities:
- Architect, implement, and maintain distributed inference serving systems that support high-throughput, low-latency model serving across multiple AI accelerator families and cloud platforms
- Enable breakthrough research by providing scientists with high-performance inference infrastructure to develop next-generation models
- Continuously optimize inference performance—including batching, caching, and request routing strategies—to maximize compute efficiency under explosive customer growth
- Build tooling and observability to monitor system health, identify bottlenecks, and proactively resolve instability
- Introduce new techniques, architectures, and best practices to push the limits of scalability, efficiency, and reliability
- Own problems end-to-end—from design to deployment—with a strong bias toward quality, automation, and continuous improvement
- Balance rapid iteration on early-stage systems with long-term maintainability and architectural soundness
- Contribute to a culture of engineering excellence, mentorship, and team-first collaboration