NVIDIA is defining the next era of computing by tapping into the unlimited potential of AI. In this role, you will own the build, packaging, release, and CI/CD foundation for OpenShell, a platform that provides secure, sandboxed runtimes for autonomous AI agents.
Responsibilities:
- Own and evolve OpenShell’s CI/CD system across GitHub Actions, self-hosted Linux amd64/arm64 runners, GPU runners, macOS runners, reusable workflows, gated e2e jobs, release canaries, and developer-facing branch checks
- Build and harden multi-architecture release pipelines for GHCR images, Helm OCI charts, Linux and macOS CLI binaries, gateway and sandbox binaries, Python wheels, Debian packages, RPM packages, Homebrew formula generation, and install scripts
- Improve release reliability for both rolling dev builds and tagged public releases, including version derivation, automatic tagging, checksums, artifact pruning, provenance, artifact attestations, and downstream package publishing
- Drive reproducible and performant builds using mise, uv, Cargo, maturin, BuildKit, Docker/Podman, sccache, native amd64/arm64 runners, Zig, osxcross, protobuf codegen, and pinned toolchains
- Own the quality gates that decide whether code is safe to merge or ship, including Rust/Python checks, license headers, markdown/docs validation, e2e label gates, Docker/Podman e2e, Kubernetes/Helm e2e, GPU e2e, and release canary coverage
- Debug difficult build and release failures across containers, registries, runners, package managers, cross-compilation toolchains, kernel/VM runtime artifacts, and CI cache behavior
- Partner with platform engineers to make OpenShell easier to install and operate across Linux, macOS, Kubernetes, Docker, Podman, GPU environments, and experimental VM/libkrun-based runtimes
- Continuously improve CI observability, failure diagnostics, workflow runtime, cache hit rates, artifact traceability, and the developer experience for contributors and maintainers
Requirements:
- Minimum of a Bachelor's degree in Computer Science, Electrical Engineering, or a related technical field, or equivalent experience
- 8+ years of meaningful engineering experience, with strong ownership of build, release, CI/CD, developer infrastructure, or systems tooling
- Deep experience with GitHub Actions or similar CI systems, including reusable workflows, self-hosted runners, permissions, secrets, workflow gates, matrix builds, artifact handling, and failure diagnosis
- Strong Linux systems and shell scripting skills, with the ability to debug build failures at the boundary between OS packages, containers, compilers, linkers, filesystems, and runtime environments
- Experience shipping multi-platform artifacts, including container images, Linux packages, macOS artifacts, checksums, installer scripts, and public release assets
- Working knowledge of Rust and Python build ecosystems, including Cargo, cross-compilation, Python wheels, uv, maturin, protobuf generation, and native dependency management
- Experience with Docker, BuildKit/buildx, container registries, OCI images, Helm charts, Kubernetes deployment/testing flows, and Docker/Podman compatibility concerns
- Strong understanding of supply-chain hardening: pinned actions, dependency lockfiles, release provenance, artifact checksums, SBOMs, attestations, least-privilege CI permissions, and secret hygiene
- Ability to reason about release risk, keep pipelines reliable under active development, and communicate clearly when a release should stop, continue, or be rolled back
Preferred qualifications:
- Experience building release systems for Rust-heavy products with Python bindings or SDKs
- Hands-on experience with native amd64/arm64 CI, GPU CI, WSL, Jetson/Tegra, CDI, or NVIDIA container workflows
- Experience with macOS cross-compilation, Homebrew formula generation, codesigning, osxcross, Zig, musl/glibc compatibility, or manylinux wheels
- Familiarity with Debian, RPM, Snap, systemd user services, or packaging products that install local daemons and helper binaries
- Track record of reducing CI cost and latency through cache strategy, workflow decomposition, runner selection, and build graph simplification