Lime, the largest global shared micromobility business, is seeking a high-impact Senior MLOps & Data Systems Engineer to help build and scale the core data and machine learning infrastructure for the Lime Vision team. The role focuses on designing and developing systems and workflows for reliable, scalable model development, evaluation, and deployment, and on collaborating with cross-functional teams to improve model performance in real-world conditions.
Responsibilities:
- Design, build, and maintain scalable pipelines that span data ingestion, annotation, validation, training, evaluation, and deployment, ensuring reproducibility, consistency, and traceability across the full ML lifecycle
- Build and integrate annotation workflows with upstream data ingestion and training systems, enabling efficient task creation, labeling, QA, and dataset updates that directly support model iteration
- Analyze model performance and failures, and drive targeted data improvements by connecting production signals, data mining, and annotation workflows into continuous feedback loops
- Implement systems for experiment tracking, dataset versioning, and model lineage to enable reliable comparison and iteration across experiments
- Develop and maintain CI/CD workflows tailored to ML systems, enabling automated testing, validation, and deployment of models and pipelines
- Collaborate with embedded and platform teams to support the deployment of models to edge environments, ensuring compatibility, performance, and reliability
- Implement monitoring, logging, and feedback systems to track model performance in production and drive continuous improvement through data and model iteration
- Optimize training and inference workflows across cloud environments, including efficient use of GPUs and other compute resources
- Work closely with applied scientists, embedded engineers, and data teams to ensure alignment across data workflows, model development, and deployment systems
- Participate in and improve the full ML lifecycle, from raw data ingestion and annotation through training, evaluation, deployment support, and post-deployment analysis
Requirements:
- 5+ years of industry experience in MLOps, ML infrastructure, data systems, machine learning engineering, or related roles
- Strong programming skills in Python, with experience in ML frameworks such as PyTorch or TensorFlow
- Experience building and maintaining end-to-end ML pipelines, including data ingestion, annotation, training, evaluation, and deployment workflows
- Experience designing or integrating annotation and data curation workflows, and understanding how labeled data impacts model performance
- Strong understanding of dataset versioning, data lineage, and reproducibility in machine learning systems
- Experience with experiment tracking and model lifecycle management
- Familiarity with CI/CD tools (e.g., GitHub Actions, GitLab CI, Jenkins) and applying them to machine learning workflows
- Experience with containerization (Docker) and workflow orchestration systems
- Experience with cloud-based ML environments (e.g., AWS) and distributed training workflows
- Strong understanding of real-world data challenges, including noisy inputs, edge cases, and variability across environments
- Strong problem-solving and debugging skills, particularly in complex, multi-stage systems
- Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field (or equivalent practical experience)
Preferred qualifications:
- Experience supporting computer vision or perception systems
- Familiarity with annotation platforms (e.g., Labelbox) and large-scale labeling workflows
- Experience with experiment tracking tools (MLflow, Weights & Biases, or similar)
- Experience with workflow orchestration frameworks (Airflow, Argo, Prefect, or Kubeflow)
- Experience with dataset versioning and data-centric ML approaches
- Experience supporting edge or embedded ML deployment
- Experience working with multi-modal data (e.g., camera, IMU, GPS)