Amazon Web Services (AWS) is seeking a Software Development Engineer for the Neuron Foundation Tools Team. The role involves developing and maintaining high-performance monitoring and profiling tools for machine learning applications, focusing on optimizing AI workloads and collaborating with cross-functional teams to enhance performance analysis tools.
Responsibilities:
- Develop and maintain high-performance monitoring and profiling tools for machine learning applications and AI accelerators
- Work on design, development, and deployment of the Neuron Profiler and other Neuron Tools
- Manage the full development life cycle of the Neuron Profiler/Tools toolchain, ensuring scalability, reliability, and usability
- Collaborate with cross-functional teams to ensure that the C++ compiler and runtime generate the key information needed for performance optimization
- Drive innovations that allow the profiler to support multiple frameworks, such as PyTorch, JAX, and XLA
- Work with executive leadership, senior management, and technical leaders to define product direction and deliver it to customers