NVIDIA is known as 'the AI computing company' and is seeking an AI & Deep Learning Compiler Engineer for its Deep Learning & AI Compiler team. The role involves analyzing deep learning networks and developing compiler optimization algorithms to enhance the performance of NVIDIA's inference engine across various applications.
Responsibilities:
- Analyzing deep learning networks and developing compiler optimization algorithms
- Collaborating with members of the deep learning software framework teams and the GPU architecture teams to accelerate the next generation of deep learning software
- Defining public APIs, optimizing and analyzing performance, and designing and implementing compiler techniques for AI workloads and future NVIDIA GPUs
Requirements:
- Bachelor's, Master's, or Ph.D. in Computer Science, Computer Engineering, or a related field, or equivalent experience
- 3+ years of relevant work or research experience in performance analysis and compiler optimizations
- Experience with compiler technologies (e.g., MLIR, LLVM, XLA, Triton)
- Excellent C/C++ and Python programming and software design skills, including debugging, performance analysis, and test design
- Ability to work independently, define project goals and scope, and lead your own development efforts
- Strong interpersonal skills and the ability to work in a dynamic, product-oriented team
- Proficiency in CPU and/or GPU architecture
- CUDA or OpenCL programming experience
- Understanding of deep learning models, algorithms, and frameworks such as PyTorch and JAX
- GPU kernel authoring and performance analysis using tools such as Nsight Compute
- A track record of success in mentoring early-career engineers and interns is a bonus
- A track record of new hardware bring-up is a plus