Stitch Fix is the leading online personal styling service that helps people discover styles that fit perfectly. The Manager of Data & AI Platform Engineering will lead the organization, managing engineers on the core data, machine learning, and generative AI platforms and contributing to the technical execution of AI-powered, data-driven experiences across the company.

Responsibilities:

Lead, in a player-coach capacity, the execution of Stitch Fix’s next-gen Data, ML, and GenAI platforms: build a unified, secure, and scalable architecture for semantic search, retrieval-based intelligence, multi-model orchestration, and agent automation, and operationalize GenAI through safe, performant, production-ready systems that power real-world client and employee experiences
Contribute to the modernization of data and ML foundations to support unified signals, adaptive models, experimentation velocity, and scalable AI/ML workloads
Provide foundational APIs, SDKs, frameworks, and self-service tools that make it easy for data scientists, ML engineers, analysts, and application teams to build and deploy AI solutions quickly, safely, and at scale
Partner with Data Science, Engineering, and Product teams to translate Data/ML/GenAI platform capabilities into production-grade features and intelligent experiences that deliver measurable business value
Drive responsible AI and data adoption by creating reusable templates, documentation, and enablement programs, and by partnering closely with technology and business teams to identify and prioritize high-impact opportunities for personalization, automation, and intelligence
Contribute to improving governance practices, including data contracts, lineage, metric definitions, access policies, and responsible AI guardrails, to support trust, safety, and compliance
Ensure operational excellence through platform reliability, performance, observability, cost efficiency, and simplification of legacy systems
Lead and develop high-performing engineering teams, fostering a culture of clarity, excellence, and trust
Balance speed of innovation with platform stability, ensuring engineering efforts are tightly aligned to business priorities and long-term client value

Requirements:

5+ years in software, data, ML, or platform engineering; 1+ years leading engineering individual contributors is a plus
Demonstrated success contributing to large-scale data platforms, ML platforms, or AI/GenAI platforms in cloud environments
Experience delivering platform modernization, unification, and multi-year architectural transformation
Strong software engineering foundation, with experience designing and building large-scale distributed systems and resilient, high-quality APIs and services using modern programming languages and cloud-native architectures
Track record operating and evolving modern data infrastructure, including some of the following: distributed compute and storage technologies (Spark, Trino, Iceberg), real-time processing frameworks (Kafka/Flink), metadata / catalog systems, and Kubernetes-based orchestration
Expertise across the ML lifecycle: feature engineering, training pipelines, model deployment and serving, monitoring, validation, fine-tuning, and MLOps best practices
Proven capability in building self-service platform abstractions and tooling that enable teams to develop, experiment, and deploy data and ML products efficiently
Experience with modern GenAI architectures: semantic retrieval, knowledge-grounded indexing, LLM orchestration, agent workflows, and evaluation frameworks
Familiarity with modern ML frameworks like PyTorch and Ray is a plus
Strategic thinker able to align platform investments with business priorities and emerging AI opportunities
Potential to be a strong people leader, with a track record of helping build inclusive, high-performing engineering teams
Excellent communicator who can influence both technical and business stakeholders across domains

Manager, Data & AI Platform Engineering
