About this role

Amazon Web Services (AWS) is building AWS Neuron, a software development kit for accelerating deep learning and GenAI workloads. The role involves architecting and implementing business-critical features, mentoring a team of engineers, and optimizing machine learning models for AWS's custom hardware accelerators.

Responsibilities:

Design, develop, and optimize machine learning models and frameworks for deployment on custom ML hardware accelerators
Participate in all stages of the ML system development lifecycle, including distributed-computing-based architecture design, implementation, performance profiling, hardware-specific optimizations, testing, and production deployment
Build infrastructure to systematically analyze and onboard multiple models with diverse architectures
Design and implement high-performance kernels and features for ML operations, leveraging the Neuron architecture and programming models
Analyze and optimize system-level performance across multiple generations of Neuron hardware
Conduct detailed performance analysis using profiling tools to identify and resolve bottlenecks
Implement optimizations such as fusion, sharding, tiling, and scheduling
Conduct comprehensive testing, including unit and end-to-end model testing, with continuous deployment and releases through pipelines
Work directly with customers to enable and optimize their ML models on AWS accelerators
Collaborate across teams to develop innovative optimization techniques

Software Development Engineer, AI/ML, AWS Neuron, Model Inference
