Archetype AI is developing an AI platform that transforms real-world data into valuable insights. The company is seeking a highly motivated backend engineer to architect and maintain the distributed systems that support AI model inference and data services, collaborating closely with researchers and product teams.
Responsibilities:
- Architect, implement, and maintain distributed systems that support high-throughput, low-latency AI model inference and data services
- Partner with ML researchers and product teams to turn experimental models into production-grade services
- Continuously optimize performance across GPU clusters, cloud infrastructure, and backend systems
- Build tooling and observability to monitor system health, identify bottlenecks, and proactively resolve instability
- Introduce new techniques, architectures, and best practices to push the limits of scalability, efficiency, and reliability
- Own problems end-to-end—from design to deployment—with a strong bias toward quality, automation, and continuous improvement
- Balance rapid iteration on early-stage systems with long-term maintainability and architectural soundness
- Contribute to a culture of engineering excellence, mentorship, and team-first collaboration
Requirements:
- 5+ years of professional software engineering experience, with a focus on backend or distributed systems
- Deep understanding of distributed systems fundamentals—concurrency, consistency, replication, fault tolerance, networking
- Experience building and operating production-grade systems at scale in cloud environments (e.g., Azure, AWS, GCP)
- Strong debugging, instrumentation, and observability skills across distributed systems
- Demonstrated ownership of complex technical problems and ability to learn and adapt quickly
- Proven track record of scaling systems through rapid growth and rebuilding or refactoring for new demands
- Proficiency in systems programming languages (e.g., Rust, C++) and scripting environments (e.g., Python)
- Experience designing internal tools or platforms to support developer productivity and experimentation
- Strong product intuition and the ability to collaborate closely with cross-functional teams, including research and design
- Familiarity with modern ML stacks and hardware acceleration (e.g., PyTorch, CUDA)