Modular is on a mission to revolutionize AI infrastructure by rebuilding the AI software stack. As a GenAI Systems Engineer, you'll architect robust frameworks and optimize advanced inference processes to enhance the Modular platform for AI model deployment.

Responsibilities:

Leverage a broad understanding of available libraries and concurrency techniques to inform high-impact architecture decisions
Identify and implement architecture-level optimizations in complex distributed systems
Architect and implement building blocks and APIs to accelerate the development of advanced distributed optimizations
Lead cross-functional projects spanning multiple teams and multiple layers of a deep tech stack
Build beautiful abstractions to seamlessly weave async RESTful layers with intensive data processing layers
Collaborate with the cloud inference team to maximize flexibility in scalable cluster deployments
Develop extensible customization interfaces to support open source community models and features
Develop detailed and intuitive metrics, logging, and profiling tools

Requirements:

Expert-level Python programming with a deep understanding of asyncio and event loops
5+ years of systems programming experience with a focus on performance and concurrency
Hands-on experience with robust low-latency applications running production workloads
Extensive experience designing software architecture and interfaces, and collaborating on those designs
Deep understanding of the fundamentals of profiling, benchmarking, and performance optimization
Creativity and curiosity for learning and solving complex distributed systems problems
Experience working inside high-performance ML inference systems (e.g. vLLM, SGLang)
Experience with Kubernetes, containers, microservices, and cloud-native architectures
Experience with graph-based (e.g. dataflow, actors) programming models and runtimes
Experience with distributed runtimes such as Ray, Open MPI, Dask, or Spark
