SambaNova is a leader in generative AI technology, providing a full-stack AI platform optimized for enterprise and government organizations. The Principal Cloud Backend Engineer will architect and build core systems for AI inference services, focusing on monetization strategies and ensuring reliability and scalability of the platform.
Responsibilities:
- Lead the technical vision and architecture for our inference serving and monetization platform
- Design systems that are fault-tolerant, highly available, and can scale to meet growing demand while accurately tracking usage for billing
- Architect the core systems for flexible monetization, including:
  - Designing a flexible system to define and enforce complex usage plans, rate limits, and access policies
  - Building a highly reliable and accurate system to meter usage (e.g., tokens, requests) at scale and prepare data for billing
  - Designing clean abstractions and APIs to seamlessly integrate with external billing and payment providers (e.g., Stripe, Metronome)
- Architect and implement complex distributed systems involving real-time rate limiting, quota enforcement, and fair-share scheduling for a multi-tenant environment
- Identify and eliminate bottlenecks in the end-to-end system, ensuring low-latency request handling while maintaining financial accuracy
- Serve as a technical leader and mentor, establishing best practices in code quality, testing, and observability for business-critical financial data pipelines
- Work closely with Product Management, Finance, and GTM teams to translate business requirements for new pricing models (e.g., subscriptions, pay-as-you-go, custom enterprise plans) into scalable technical solutions
Requirements:
- 10+ years of experience in software engineering, with a significant focus on designing and building large-scale, distributed backend systems in cloud environments
- 5+ years in a Principal or Lead Engineer role, with a proven track record of architecting, delivering, and operating business-critical platforms
- Expert proficiency in one or more of the following: Go, Rust, or C++. Deep understanding of concurrency, performance optimization, and systems programming
- Deep, hands-on experience with cloud-native technologies (Kubernetes, Docker, etc.) and major cloud providers (AWS, GCP, Azure)
- Extensive experience with both SQL and NoSQL databases (e.g., PostgreSQL, Redis) and designing data models for high-throughput, low-latency applications
- Strong foundation in API design (REST, gRPC), event-driven architecture, and building resilient microservices
- Excellent communication and leadership skills, with the ability to drive technical consensus and articulate complex concepts to a diverse audience
- Direct Monetization/Billing Experience: Proven experience building or significantly extending platforms for usage-based metering, subscription management, entitlements, or billing systems. Experience with billing providers (e.g., Stripe, Metronome) is a strong plus
- Experience in AI/ML Infrastructure: Direct experience building or operating platforms for serving, scaling, and managing AI models (e.g., inference servers, model deployment pipelines)