NVIDIA, a leader in accelerated computing, is seeking a Senior Software Engineer to develop and implement CUDA Core Libraries for GPU computing. The role involves working on C++ and Python libraries, optimizing GPU algorithms, and improving the developer experience for CUDA users.

Responsibilities:

Develop and implement CUDA Core Libraries in C++ and/or Python, including parallel algorithms and idiomatic language bindings for core CUDA functionality
Compose, optimize, and evolve GPU algorithms and APIs, from high-level interfaces down to low-level performance tuning involving memory, parallelism, and synchronization
Own features end-to-end: design, implementation, testing, benchmarking, documentation, and long-term maintenance
Improve developer experience across the stack: CI, tests, benchmarks, packaging, examples, and docs
Collaborate with senior CUDA engineers in design reviews, code reviews, and open-source-style workflows
Engage with real users through issues, performance investigations, and API feedback

Requirements:

BS, MS, or PhD in Computer Science, Computer Engineering, or a related field, or equivalent experience
8+ years of related development experience
Strong programming skills in C++, Python, or both, with proven interest in systems-level software (performance, memory, concurrency, API design)
Solid understanding of modern C++ (templates, generics, standard library) and/or Python library development and packaging
Practical experience with parallel or heterogeneous programming (CUDA, OpenMP, GPU-accelerated Python, or similar)
Experience contributing to production software or open-source libraries, including testing, profiling, and code review
Ability to work independently, scope problems, and drive projects to completion
Clear written communication for technical design and documentation
Comfort navigating large, multi-language codebases (C++, Python, CMake, Pixi, CI systems)
Strong understanding of CPU/GPU architecture and how hardware details affect performance
Hands-on experience with CUDA C++, CUDA Python, PyTorch, JAX, Numba, CuPy, or similar GPU-accelerated stacks
Familiarity with Thrust, CUB, libcudacxx, or other modern C++/GPU libraries
Experience with compiler infrastructure or tooling (LLVM, Clang tooling, MLIR)
Demonstrated interest in developer tools, library design, and making other developers faster

Senior Software Engineer, CUDA Core Libraries
