NVIDIA is looking for an expert software engineer to help deliver CUDA-X libraries across the NVIDIA CPU and GPU ecosystem. The role involves designing, developing, and optimizing math libraries while collaborating with internal and external partners to meet their requirements.
Responsibilities:
- Design modern, flexible, and easy-to-use APIs and kernels for math libraries and lead design reviews with all collaborators
- Work closely with internal partners (e.g., Engineering, Product Management) and external partners (e.g., researchers) to understand their use cases and requirements
- Work with internal and external customers to deliver timely math library releases
- Become a domain expert by continuously surveying current trends in software systems
Requirements:
- PhD or MSc degree in Computer Science, Applied Math, or a related science or engineering field is preferred (or equivalent experience)
- 12+ years of experience designing and developing software for high-performance computing and/or AI applications
- Advanced C++ skills, including modern design paradigms (e.g., template meta-programming, RAII)
- Parallel programming experience with CUDA, OpenCL or vector programming on CPU (AVX, NEON or similar)
- Strong collaboration, communication, and documentation habits
- Experience with ARM, RISC-V and/or x86_64 CPU architectures
- Strong background in numerical methods (e.g., FFT, numerical linear algebra)
- Programming skills in Python and experience with modern automation setups for both building software (e.g., CMake) and testing it (e.g., CI/CD, sanitizers)
- Experience with cross-compilation, setting up CPU/GPU/accelerator (cross-)compilation toolchains, and porting existing code to new architectures
- Background in CCCL, OpenMP, OpenACC, multi-threading, MPI, and PGAS
- Experience with scientific and deep learning libraries and frameworks such as PyTorch, JAX, MKL, MAGMA, PETSc, and Kokkos