Cube is redefining how organizations deliver and automate data and analytics. As a Staff DevOps Engineer, you will set the technical direction for the infrastructure that runs Cube Cloud and the agentic analytics platform, working with engineering, security, and core platform teams to evolve the multi-cloud architecture and improve developer productivity.
Responsibilities:
- Set the technical direction for the infrastructure that runs Cube Cloud and the agentic analytics platform
- Own complex, high-impact initiatives end-to-end, from architecture to rollout
- Collaborate closely with engineering, security, and core platform teams to evolve our multi-cloud architecture
- Harden our agentic analytics runtime for production
- Turn operational challenges into elegant, automated systems
- Design the infrastructure for AI-driven workloads, ensuring they are fast, reliable, and safe
- Drive design decisions for hybrid and self-hosted deployments
- Lead improvements to CI/CD, build and release systems, and the internal developer platform
- Lead initiatives across IAM, network security, secrets management, audit, incident response, and SLO practice
Requirements:
- Deep understanding of major cloud environments (AWS, GCP, Azure), including networking, IAM, and managed services at scale
- Strong expertise with Kubernetes, including operating, upgrading, and tuning production clusters in multi-tenant environments
- Strong experience with IaC tools such as Terraform, Pulumi, or similar, and modern GitOps workflows
- Solid background in designing and operating CI/CD systems and internal developer platforms
- Ability to write production-quality code in TypeScript/JavaScript, Python, Go, or similar
- Track record of leading large infrastructure initiatives end-to-end, influencing technical strategy, and mentoring other engineers
- Strong grasp of observability, incident response, and reliability engineering for distributed systems
- Strong written and verbal communication skills
- Previous startup experience or a genuine interest in working in a fast-moving company with a high level of ownership
- Strong knowledge of TypeScript and experience integrating with Node.js-based services
- Hands-on experience with Pulumi
- Experience writing code in Rust (Cube's core query engine is written in Rust)
- Experience operating multi-tenant SaaS platforms and supporting self-hosted / BYOC deployments
- Experience running infrastructure for AI/LLM workloads or building MLOps tooling
- Background in data engineering, analytics applications, or OLAP systems
- Compliance experience (SOC 2, HIPAA, ISO 27001)