Stripe is a financial infrastructure platform for businesses, aiming to increase the GDP of the internet. The role involves working with machine learning engineers and data scientists to build scalable ML infrastructure services that enhance ML development and operational capabilities across the company.
Responsibilities:
- Designing and building scalable, reliable, and secure services for notebooks, ML model training, experimentation, serving, and LLM applications across multiple regions
- Creating services and libraries that enable ML engineers at Stripe to seamlessly transition from experimentation to production across Stripe’s systems
- Working directly with product teams and ML engineers to improve their day-to-day productivity
- Owning and solving technical and product challenges across a diverse set of systems, processes, and technologies
Requirements:
- 2+ years of professional software development experience with a solid background in service-oriented architecture and large-scale distributed systems
- Experience working through the full life cycle of software development, from talking to users, to design and implementation, to testing and deployment, to operations
- Experience working on production ML platforms, MLOps solutions, or building LLM applications
- Experience running operations for high-availability, low-latency systems
- Experience partnering with other teams to drive business outcomes
- A sense of pragmatism: you know when to aim for the ideal solution and when to adjust course
- Experience building and shipping production AI agents
- Familiarity with LLMs and LLM frameworks
- Experience training and shipping machine learning models to production to solve critical business problems