Allstate is a company dedicated to protecting families and their belongings from life's uncertainties. They are seeking a Machine Learning Platform Lead Engineer to architect, build, and scale core platforms for enterprise-wide machine learning solutions, providing technical leadership and collaborating with various teams to enable ML adoption across the enterprise.
Responsibilities:
- Serve as the technical lead for ML platform architecture, guiding system design, scalability, performance, and reliability across platform components
- Architect and build core ML platform services, including training and compute infrastructure, feature stores, model registries, inference runtimes, and data pipelines
- Drive architectural decisions for distributed systems, cloud-native frameworks, and automated MLOps workflows that support enterprise-scale machine learning
- Evaluate and integrate emerging ML platform technologies, tools, and best practices to continuously strengthen platform capabilities
- Design and implement robust MLOps pipelines for experiment tracking, data and model versioning, CI/CD for ML, automated retraining, and model governance
- Develop automated workflows that ensure reproducible model training, validation, deployment, and lifecycle management across multiple environments
- Implement monitoring and observability systems for model performance, data quality, drift detection, and inference reliability
- Build and optimize cloud-based ML infrastructure on Azure, AWS, or GCP using Kubernetes, containerization, and infrastructure-as-code
- Develop scalable batch and streaming data pipelines using modern data engineering tools and frameworks
- Embed security, compliance, responsible AI principles, and cost optimization best practices within ML platform architecture and operations
- Collaborate with data scientists to translate modeling needs into scalable, reusable, and self-service platform capabilities
- Work closely with security, compliance, and governance teams to ensure safe and compliant deployment of AI/ML solutions
- Partner with application engineering teams to accelerate adoption of ML services and enable consistent, high-quality production deployments
- Provide technical mentorship, set engineering standards, and contribute to documentation, best practices, and ongoing platform improvements
Requirements:
- Extensive experience in ML engineering, platform engineering, or large-scale distributed systems
- Deep hands-on expertise with MLOps tools, ML frameworks, model deployment techniques, and ML lifecycle automation
- Strong proficiency in Python and backend development for machine learning systems
- Experience with cloud platforms and ML services, including Azure Machine Learning, Amazon SageMaker, and/or Google Vertex AI
- Exposure to cloud storage and data platforms such as Microsoft Fabric/OneLake, Amazon S3, and Google Cloud Storage (GCS)
- Experience with cloud-native scanning and security tools such as Microsoft Defender for Cloud (formerly Azure Defender), Microsoft Purview, AWS Security Hub, Amazon Inspector, Google Cloud Security Command Center, or equivalent services
- Strong understanding of Kubernetes, Docker, CI/CD, and infrastructure-as-code tools such as Terraform
- Solid knowledge of system design, APIs, data pipelines, and scalable ML infrastructure patterns
- Proven ability to lead technical initiatives and influence cross-team engineering decisions
- 6+ years of related experience (preferred)