Solve real business needs at large scale by applying your software engineering and analytical problem-solving skills.
Design, implement, and maintain scalable distributed systems for our cloud automation platform, including the cloud control plane, the Kubernetes container platform, and traffic and networking.
Work directly with customers to quickly understand their critical problems, then design and implement solutions.
Use automation to deploy and maintain the availability of the cloud compute servers and Kubernetes clusters that power the Snowflake platform in sensitive (sometimes air-gapped) production environments.
Implement software delivery pipelines that support continuous delivery and automatic compliance in sensitive runtime environments.
Ensure the operational readiness of our services and meet our commitments to customers on security, reliability, availability, and performance.
Collaborate closely with product teams on requirements & SLOs for deploying software into air-gapped environments.
Identify, troubleshoot, and solve network & systems issues.
Use AI-driven tooling to automate operational tasks.
Requirements
7+ years of industry experience designing and supporting large-scale distributed systems in production, with recent experience deploying for public sector customers
In-depth experience with container orchestration, cloud infrastructure, and IaC tools such as Terraform or Pulumi
Strong CS fundamentals including data structures, algorithms, and distributed systems
In-depth development skills in Java, C++, Go, or Python
Experience with public cloud platforms such as AWS, Azure, or GCP
Experience with database systems and database internals preferred