Lemurian Labs is reimagining the foundations of computing to make AI accessible to everyone. They are seeking a Runtime Engineer to design and build a multi-target runtime for their AI compiler stack, focusing on low-level parallelization, kernel scheduling, and performance analysis.
Responsibilities:
- Design, develop, maintain, and improve our multi-target runtime
- Apply the latest techniques in parallelization and partitioning to automate kernel generation and exploit highly optimized execution paths
- Rapidly prototype new runtime ideas and evaluate them with data-driven experiments
- Benchmark and analyze the outputs produced by our optimizing compiler on target hardware
- Build tools to collect performance data and analyze bottlenecks
- Work closely with our product team to understand the evolving needs of ML engineers and drive improvements in runtime architecture