Adaptive ML is a frontier AI startup building a Reinforcement Learning Operations (RLOps) platform that enables enterprises to specialize and deploy LLMs into production. The Member of Technical Staff will contribute to building foundational technology, focusing on high-performance software engineering and large-scale RL research, while also engaging in systematic empirical research to drive the product roadmap.
Responsibilities:
- Develop robust software in Rust, interfacing between easy-to-use Python recipes and high-performance, distributed training code running on hundreds of GPUs
- Profile and iterate on GPU inference kernels in Triton or CUDA, identifying memory bottlenecks and optimizing latency, and decide how to adequately benchmark an inference service
- Develop and execute an experiment analyzing nuances between DPO and PPO in a fair and systematic way
- Build data pipelines to support reinforcement learning from noisy and diverse user interactions across varied tasks
- Experiment with new ways to combine adapters and steer the behavior of language models
- Build hardware correctness tests to identify and isolate faulty GPUs at scale
- Build the foundational technology powering Adaptive, with a focus on high-performance software engineering and large-scale RL research
- Contribute to our product roadmap by identifying promising trends and high-impact findings
- Report clearly on your work to a distributed collaborative team, with a bias for asynchronous written communication
- Write high-quality software in Rust, with a focus on performance and robustness
- Profile dedicated GPU kernels in CUDA or Triton, optimizing across latency-bound and compute-bound regimes for complex workloads
- Identify and resolve bugs in large distributed systems, at the intersection of software and hardware correctness
- Conduct research on large language models or diffusion models, systematically exploring how reinforcement learning can be used to personalize models
- Reproduce results from the RL, LLM, and diffusion literature, distinguishing the noise from the groundbreaking
- Own a research agenda, with a bias for at-scale, systematic empirical research