Develop software systems to support large-scale deployments of cloud infrastructure.
Design, develop, and distribute APIs to support Infrastructure as Code (IaC) automation and deployment workflows.
Contribute to multiple source code projects that deliver the software services NVIDIA requires.
Collaborate with engineering managers, architects, designers, and frontend engineers to deliver high-quality software.
Automate the validation of software solutions with unit and integration tests.
Innovate with other engineers on proposed designs and product direction.
Openly share successes and failures in a no-blame environment.
Requirements
BS in Computer Science, Information Systems, or Computer Engineering, or equivalent experience.
8+ years of professional experience.
3–5 years of hands-on experience in large-scale software development using modern languages and frameworks.
Strong proficiency in Golang for developing Kubernetes operators, controllers, and custom tools.
Proven experience building, deploying, and scaling services on Kubernetes, including work with CRDs and auto-scaling infrastructure.
Expertise with cloud-native infrastructure and managed Kubernetes services across AWS, GCP, Azure, and OCI.
Demonstrated ability to collaborate with cross-functional teams to deliver performant, reliable cloud services at scale.
Experience participating in incident response, performing root cause analysis, and implementing preventive measures to improve reliability.
Excellent communication and troubleshooting skills across infrastructure, Kubernetes, and application runtime layers, with the ability to articulate design decisions and quality strategies clearly.