Life360 is a company dedicated to keeping families connected and safe through innovative mobile applications and tracking devices. They are seeking a Senior Machine Learning Operations Engineer II to design and manage the infrastructure and automated pipelines for machine learning models, ensuring their reliable deployment and monitoring in production environments.

Responsibilities:

Pipeline Automation: Design, implement, and manage automated CI/CD and Continuous Training (CT) pipelines for machine learning model development, evaluation, and delivery
Model Deployment: Containerize, deploy, and scale machine learning models as high-availability microservices or batch processing workflows
Observability & Monitoring: Establish unified logging, alerting, and monitoring solutions to track model inference performance, system latency, resource utilization, data drift, and concept drift
Infrastructure Management: Provision and optimize cloud-based ML infrastructure (including GPU/CPU computing clusters) utilizing Infrastructure as Code (IaC) paradigms
Cross-Functional Collaboration: Work closely with product development teams to drive infrastructure adoption and efficiency gains through SDK/API development, automation, and efficient ML system maintenance
Governance & Compliance: Implement robust lineage tracking for data, code, and model artifacts to ensure compliance, reproducibility, and security across the entire ML lifecycle
Data Infrastructure & Tooling: Work with data engineering to improve the data ecosystem, ensuring robust, scalable pipelines for experimentation and ML (including streaming tools like Kafka and Flink for low-latency online inference)
Thought Leadership: Act as a mentor and thought leader, helping to define best practices in machine learning engineering, scalable ML service ops, and agentic AI (AI-Native) development

Requirements:

5+ years of professional software engineering, DevOps, or data engineering experience, with at least 2 years dedicated to building and maintaining MLOps infrastructure
Strong proficiency in Python, including deep familiarity with software engineering best practices (unit testing, modular design, version control via Git)
Hands-on experience with containerization (Docker) and container orchestration platforms, specifically Kubernetes (EKS, GKE, or native clusters), plus experience with related tools such as FastAPI
Proven familiarity with specialized ML lifecycle and data processing tools and platforms such as MLflow, Kubeflow, SparkML, SynapseML, SQL, Spark/PySpark, dbt, and Airflow
Practical experience operating within a major cloud ecosystem (e.g., AWS, GCP, Databricks), with a clear grasp of cloud networking, security, and storage tiers
Strong communication and project leadership skills, with the ability to influence cross-functional teams
Bachelor's or Master's degree in Computer Science, Data Science, Software Engineering, or a closely related quantitative field
Experience implementing and scaling production feature stores (e.g., Feast, Tecton) and model registries
Prior experience deploying and optimizing Large Language Models (LLMs) or foundation models utilizing serving frameworks like vLLM, Triton Inference Server, or TGI
Proficient with IaC frameworks, particularly Terraform, to manage reproducible environments
Familiarity with distributed data computation engines such as Apache Spark, Ray, or Dask
Relevant cloud or architecture credentials, such as AWS Certified Machine Learning Specialty, Google Cloud Professional Machine Learning Engineer, or Certified Kubernetes Administrator (CKA)
Experience in subscription-based products, lifecycle marketing, or user acquisition
Experience with geospatial data and mobile location-based services
Experience in the consumer technology sector, particularly within a fast-paced and sometimes ambiguous development setting

Senior Machine Learning Operations Engineer II (AI Native)
