NVIDIA is looking for an expert software engineer to help deliver CUDA-X libraries across the NVIDIA CPU and GPU ecosystem. The role involves designing, developing, and optimizing math libraries while collaborating with internal and external partners to meet their requirements.

Responsibilities:

Design modern, flexible, and easy-to-use APIs and kernels for math libraries, and lead design reviews with all collaborators
Work closely with internal (e.g., Engineering, Product Management) and external (e.g., researchers) partners to understand their use cases and requirements
Work with internal and external customers to deliver timely math library releases
Become a domain expert by continuously surveying current trends in software systems

Requirements:

PhD or MSc degree in Computer Science, Applied Math, or a related science or engineering field is preferred (or equivalent experience)
12+ years of experience designing and developing software for high-performance computing and/or AI applications
Advanced C++ skills, including modern design paradigms (e.g., template meta-programming, RAII)
Parallel programming experience with CUDA, OpenCL or vector programming on CPU (AVX, NEON or similar)
Strong collaboration, communication, and documentation habits
Experience with ARM, RISC-V and/or x86_64 CPU architectures
Strong background in numerical methods (e.g., FFT, numerical linear algebra)
Programming skills in Python, and experience with modern automation setups for both building software (e.g., CMake) and testing it (e.g., CI/CD, sanitizers)
Experience with cross-compilation, setting up CPU/GPU/accelerator (cross-)compilation toolchains, and porting existing code to new architectures
Background with CCCL, OpenMP, OpenACC, multi-threading, MPI, PGAS
Experience with scientific and deep learning libraries and frameworks such as PyTorch, JAX, MKL, MAGMA, PETSc, and Kokkos

Senior Math Libraries Engineer, CPU and GPU Optimization
