10Beauty is a robotics startup based in Burlington, MA, focused on building the world’s first fully autonomous manicure machine. They are seeking a Senior DevOps Engineer to build, automate, and maintain the infrastructure for their robotic machines, while collaborating with engineering teams to define their DevOps and MLOps strategy.
Responsibilities:
- Build & Maintain CI/CD Pipelines: Architect and manage robust CI/CD pipelines for both our embedded software and our machine learning models, ensuring seamless integration, testing, and deployment
- Fleet Management & OTA Updates: Develop and maintain the infrastructure for secure and scalable Over-the-Air (OTA) firmware updates to our fleet of robotic machines deployed in the field
- Infrastructure as Code (IaC): Manage and provision our cloud infrastructure (AWS), ensuring it is repeatable, scalable, and secure
- Containerization & Orchestration: Deploy and manage containerized applications using Docker and Kubernetes
- Monitoring & Alerting: Implement and maintain comprehensive monitoring, logging, and alerting systems to ensure the health, performance, and reliability of our entire stack, from the cloud to the robotic machines themselves
- Collaboration & Planning: Work closely with our ML, embedded, and software engineering teams to help define our long-term DevOps and MLOps strategy
- MLOps: Automate the end-to-end ML lifecycle, including model versioning, training pipeline automation, model serving, and continuous monitoring of model performance in production
Requirements:
- 5-8 years of professional experience in a DevOps, SRE, or similar role
- Hardware/robotics experience: You have experience with the unique challenges of embedded systems, managing OTA updates, and deploying software to physical devices in the field
- Strong proficiency with cloud platforms: AWS, GCP, or Azure (AWS preferred)
- Expertise with CI/CD tools: Experience with Jenkins, GitLab CI, GitHub Actions, or similar platforms
- Proficiency in containerization: Strong skills with Docker and Kubernetes
- Hands-on experience with IaC: Terraform is a plus
- Excellent scripting skills: Python or Bash
- Experience with monitoring tools: Prometheus, Grafana, Datadog, or similar
- A proactive, problem-solving mindset and a passion for building scalable, reliable systems
- MLOps expertise: A proven track record of building and managing pipelines for the deployment and monitoring of machine learning models in a production environment