Take existing PyTorch and JAX, profile and optimize it without compromising model accuracy, reproducibility, and robustness with respect to scientific objectives.
Develop with frameworks like CUDA, Triton, Warp, etc. to accelerate performance critical code sections.
Liaise with NVIDIA to represent our needs and implement their tooling in our environment.
Work day-to-day with scientists to identify areas of greatest impact, including travel to our SF and NY working groups to collaborate.

Engineer with at least two years professional experience in GPU optimization.
Deep understanding of GPU programming fundamentals.
Solid track record of observable artifacts (e.g., GitHub) showing optimization work.
Experience collaborating on software projects across multi-person teams.

Machine Learning Research Engineer – GPUs

Key skills