Machine Learning Engineer, Training Optimization at Canva | JobVerse
Machine Learning Engineer, Training Optimization
Canva
Remote
China
Full Time
5 days ago
Visa Sponsorship
Key skills
Python
PyTorch
Rust
C++
C
AI
Machine Learning
JAX
Communication
About this role
Role Overview
You’ll design, implement, and optimize large-scale machine learning systems for training.
You’ll improve all aspects of performance, including GPU utilization, communication overhead, and memory efficiency.
You’ll partner with research and modeling teams to align systems with algorithmic needs.
You’ll evaluate and apply best practices for distributed training using industry-leading frameworks.
You’ll dive deep into low-level optimization, including custom CUDA or Triton kernels.
You’ll debug, profile, and fine-tune training workflows to unlock new levels of scalability.
Requirements
Strong background in LLMs, multimodal AI, or diffusion models.
Proficiency in Python.
Familiarity with a systems programming language (e.g., C++ or Rust) is a plus.
Deep knowledge of PyTorch or JAX as well as libraries such as Megatron-LM, NeMo, or DeepSpeed.
Familiarity with common optimization techniques such as FSDP/ZeRO, gradient checkpointing, or low-precision data types.
Hands-on experience writing custom GPU kernels in CUDA or Triton.
Excellent communication and problem-solving skills, including full proficiency in English.
Tech Stack
Python
PyTorch
Rust
Benefits
Employees can work remotely