General Motors is seeking an experienced Senior Machine Learning Engineer to design and build scalable AI/ML platform infrastructure. This role involves collaborating with cross-functional teams to develop advanced AI solutions for intelligent driving technologies across its vehicles.

Responsibilities:

Design and develop a scalable, reliable, high-performance ML framework to support model training at scale
Analyze model training performance and build optimizations that scale distributed training workflows, maximize resource utilization across heterogeneous hardware environments, and reduce cost
Raise the bar on system observability, debuggability, operational excellence, and user experience
Collaborate with cross-functional teams to integrate new features and technologies into the platform

Requirements:

Bachelor's degree or higher in Computer Science or an equivalent major, or equivalent relevant experience
3+ years professional software engineering experience
2+ years specialized experience in AI/ML infrastructure, e.g., enabling distributed training for scaling large ML models
Strong programming skills in Python, with proficiency in frameworks such as PyTorch (preferred), TensorFlow, or similar
Experience with distributed computing, GPU computing, and cloud environments (AWS, GCP, Azure)
Willingness to travel to Sunnyvale, CA as needed
Comfortable working in highly ambiguous and dynamic environments
5+ years of professional software engineering experience
Self-motivated, with strong execution and a focus on delivering impact
Extensive knowledge of and experience with PyTorch 2.x+ and distributed training frameworks
Experience designing and developing training frameworks that support FSDP, pipeline parallelism, and other scalable solutions for training large foundation models
Experience profiling, analyzing, debugging, and optimizing training and data loading performance
Excellent communication skills to resolve disagreements, build consensus, communicate risks, and give constructive feedback
