Amazon Web Services (AWS) is seeking a Senior Software Development Engineer to join the Annapurna Labs team, which builds the AWS Neuron SDK for accelerating deep learning workloads. This role involves architecting and implementing critical features, mentoring engineers, and optimizing machine learning models for performance on AWS's custom ML accelerators.
Responsibilities:
- Help lead the effort to build distributed inference support for PyTorch in the Neuron SDK
- Tune models for the highest performance and efficiency on custom AWS Trainium and Inferentia silicon and servers
- Collaborate across compiler, runtime, framework, and hardware teams to optimize machine learning workloads for our global customer base
- Work at the intersection of software, hardware, and machine learning systems
- Bring expertise in low-level optimization, system architecture, and ML model acceleration
- Develop and performance-tune a wide variety of LLM families, including 500B+ parameter models such as Llama, DeepSeek, and beyond
- Work side by side with performance, compiler and runtime engineers to create, build and tune distributed inference solutions with Trainium and Inferentia
- Build infrastructure to systematically analyze and onboard multiple models with diverse architectures
- Collaborate with the performance team to enable and evaluate optimizations such as fusion, sharding, tiling, and scheduling
- Conduct comprehensive testing, including unit and end-to-end model testing with continuous deployment and releases through pipelines
- Work directly with customers to enable and optimize their ML models on AWS accelerators
- Collaborate across teams to develop innovative optimization techniques
- Build online and offline inference serving with vLLM, SGLang, TensorRT, or similar platforms in production environments
- Debug performance issues, optimize memory usage, and shape the future of Neuron's inference stack across Amazon and the open source community
- Create metrics, implement automation and other improvements, and resolve the root cause of software defects
- Build high-impact solutions for our large customer base, participate in design discussions and code reviews, and communicate with internal and external stakeholders
- Work cross-functionally to help drive business decisions with your technical input