Reka AI is a globally distributed foundation model startup focused on building useful multimodal artificial intelligence. We are seeking an experienced GPU Performance Engineer to improve our training infrastructure, optimize model performance, and make our model serving infrastructure more efficient.
Responsibilities:
- Design and implement improvements to our training infrastructure
- Contribute to technical decisions that optimize the performance of our models
- Work on post-training processes, including reinforcement learning and fine-tuning
- Contribute to improving the efficiency and scalability of our model serving infrastructure
Requirements:
- Strong engineering skills with fluency in Python and PyTorch (or a comparable deep learning framework)
- Proven experience implementing and training large deep learning models
- Experience writing and debugging low-level GPU code (CUDA, C++)
- Experience scaling GPU jobs across large compute clusters managed with a scheduler or orchestrator (e.g., Slurm or Kubernetes)
- Demonstrated ability to analyze and optimize the performance of GPU-accelerated workloads, including profiling, identifying bottlenecks, and implementing performance tuning techniques