NVIDIA is at the forefront of innovation, driving advancements in AI and machine learning. The company is seeking talented engineers to join its TensorRT team, which develops deep learning inference software for NVIDIA AI accelerators.
Responsibilities:
- Design, develop, and optimize NVIDIA TensorRT and TensorRT-LLM to accelerate inference applications for datacenters, workstations, and PCs
- Develop software in C++, Python, and CUDA for the efficient deployment of state-of-the-art LLMs and generative AI models
- Collaborate with deep learning experts and GPU architects across the company to influence hardware and software design for inference