Senior DL Algorithms Engineer – Inference Performance at NVIDIA | JobVerse
Senior DL Algorithms Engineer – Inference Performance
NVIDIA
Remote
United States
Full Time
4 days ago
$184,000 - $356,500 USD
H1B Sponsor
Key skills
Microservices
PyTorch
C++
C
AI
Deep Learning
LLM
Performance Optimization
About this role
Role Overview
Implement language and multimodal model inference as part of NVIDIA Inference Microservices (NIMs).
Contribute new features, fix bugs and deliver production code to TensorRT-LLM (TRT-LLM), NVIDIA’s open-source inference serving library.
Profile and analyze bottlenecks across the full inference stack to push the boundaries of inference performance.
Benchmark state-of-the-art inference offerings across various DL models and perform competitive analysis for the NVIDIA SW/HW stack.
Collaborate heavily with other SW/HW co-design teams to enable the creation of the next generation of AI-powered services.
Requirements
PhD in CS, EE or CSEE, or equivalent experience.
5+ years of experience.
Strong background in deep learning and neural networks, in particular inference.
Experience with performance profiling, analysis and optimization, especially for GPU-based applications.
Proficient in C++, PyTorch or equivalent frameworks.
Deep understanding of computer architecture, and familiarity with the fundamentals of GPU architecture.
Proven experience with processor and system-level performance optimization.
Deep understanding of modern LLM architectures.
Strong fundamentals in algorithms.
GPU programming experience (CUDA or OpenCL) is a plus.
Tech Stack
Microservices
PyTorch
Benefits
equity
benefits