NVIDIA is a leader in high-performance computing platforms powering the AI revolution across a wide range of applications. The company is seeking a Senior Software Engineer to develop core components of the CUTLASS platform and to work with partner teams on the GPU hardware features that deliver high performance.
Responsibilities:
- Develop core components of the CUTLASS platform, including Tensor Core MMAs, copies, synchronization barriers, schedulers, and other GPU hardware features, in CUDA C++ and the CUTLASS Python DSL
- Contribute to the advancement of the MLIR-based backend compiler stack for the CUTLASS Python DSL by designing dialects and associated compiler passes
- Author example kernels that use CUTLASS abstractions to showcase novel GPU hardware features crucial for achieving high performance
- Collaborate with GPU architecture, CUDA, and NVVM/PTX compiler teams to provide feedback on programming models and to assess the performance of future GPU hardware features