Zoom is dedicated to building world-class inference infrastructure that powers all of its AI services. As an AI Software Engineer on Zoom’s AI Infra team, you will design, optimize, and scale the runtimes and services behind Zoom’s AI models, improving efficiency and reducing costs across the entire AI stack.
Responsibilities:
- Develop and optimize AI runtimes for large language model (LLM), automatic speech recognition (ASR), and machine translation (MT) systems, with a focus on performance and cost efficiency
- Apply GPU-level optimization techniques, including custom CUDA kernels, kernel fusion, and memory-bandwidth improvements
- Implement inference optimizations such as torch.compile, graph optimization, KV caching, and continuous batching
- Build scalable, highly available infrastructure services to support enterprise-grade AI workloads
- Optimize models for edge devices (laptops, PCs, and mobile devices) as well as for large-scale cloud deployments
- Continuously improve latency, throughput, and efficiency across serving pipelines
- Rapidly integrate and optimize new industry models to stay ahead in AI infrastructure