Amazon Web Services (AWS) is seeking a Software Engineer II to join the Annapurna Labs team, which focuses on building the AWS Neuron SDK for deep learning and GenAI workloads. The role involves architecting and implementing features, optimizing machine learning models for custom hardware accelerators, and collaborating with cross-functional teams to enhance performance and efficiency.
Responsibilities:
- Design, develop, and optimize machine learning models and frameworks for deployment on custom ML hardware accelerators
- Participate in all stages of the ML system development lifecycle, including distributed-computing-based architecture design, implementation, performance profiling, hardware-specific optimization, testing, and production deployment
- Build infrastructure to systematically analyze and onboard multiple models with diverse architectures
- Analyze and optimize system-level performance across multiple generations of Neuron hardware
- Conduct detailed performance analysis using profiling tools to identify and resolve bottlenecks
- Conduct comprehensive testing, including unit tests and end-to-end model tests, with continuous deployment and releases through pipelines
- Work directly with customers to enable and optimize their ML models on AWS accelerators
- Collaborate across teams to develop innovative optimization techniques