Databricks is a leading data and AI company focused on enabling data teams to tackle complex problems. The Staff Backend Software Engineer will play a critical role in shaping the Model Serving product by designing and building systems for high-throughput, low-latency inference and by collaborating across teams to optimize performance and operational efficiency.
Responsibilities:
- Design and implement core systems and APIs that power Databricks Model Serving, ensuring scalability, reliability, and operational excellence
- Partner with product and engineering leadership to define the technical roadmap and long-term architecture for serving workloads
- Drive architectural decisions and trade-offs to optimize performance, throughput, autoscaling, and operational efficiency for CPU and GPU serving workloads
- Contribute directly to key components across the serving infrastructure, from model container builds and deployment workflows to runtime systems such as routing, caching, observability, and intelligent autoscaling, so that serving runs smoothly and efficiently at scale
- Collaborate cross-functionally with product, platform, and research teams to translate customer needs into reliable and performant systems
- Lead technical initiatives that improve latency, availability, and cost-effectiveness across both customer-facing and foundational serving layers
- Establish best practices for code quality, testing, and operational readiness, and mentor other engineers through design reviews and technical guidance
- Represent the team in cross-organizational technical discussions and influence Databricks’ broader AI platform strategy
Requirements:
- 10+ years of experience building and operating large-scale distributed systems
- Deep expertise in model serving, inference systems, and related infrastructure (e.g., routing, scheduling, autoscaling, and observability)
- Strong foundation in algorithms, data structures, and system design as applied to large-scale, low-latency serving systems
- Proven ability to deliver technically complex, high-impact initiatives that create measurable customer or business value
- Experience leading architecture for large-scale, performance-sensitive CPU/GPU inference systems
- Strong communication skills and ability to collaborate across teams in fast-moving environments
- Strategic and product-oriented mindset with the ability to align technical execution with long-term vision
- Passion for mentoring, growing engineers, and fostering technical excellence