Scale AI is dedicated to solving the data bottleneck across Robotics, Autonomous Vehicles, and Computer Vision. The ML Systems Engineer will design and build platforms for scalable and efficient serving of foundation models tailored for physical agents, while collaborating with researchers and engineers to optimize models for production environments.
Responsibilities:
- Build & Scale: Build and maintain fault-tolerant, high-performance systems that serve robotics-related models and foundation models at scale, with low latency for real-time applications
- Platform Development: Build an internal platform for discovering model capabilities, enabling faster iteration cycles for research teams working on robotics
- Collaborate: Work closely with Robotics researchers and Computer Vision engineers to integrate and optimize models for production and research environments
- Design Excellence: Conduct architecture and design reviews to uphold best practices in system scalability, reliability, and security
- Observability: Develop monitoring and observability solutions that track system health and model inference performance in real time
- Lead: Own projects end-to-end, from requirements gathering to implementation, in a fast-paced, cross-functional environment