Capital One is an industry leader in using machine learning to create real-time, personalized customer experiences. They are seeking a Sr. Distinguished Machine Learning Engineer to define and drive the technical strategy for their Personalization Platform, collaborate with cross-functional teams, and develop robust ML infrastructure to enhance customer interactions.

Responsibilities:

Define and drive technical strategy and roadmap for our Personalization Platform that powers real-time, personalized product experiences and multi-channel targeted user messaging across all Capital One products and services
Partner cross-functionally with Product, Data Science, Cloud Infrastructure, and Machine Learning Platform teams to align on and co-develop the advanced recommendation systems and algorithms serving our Capital One users
Develop and maintain a flexible, scalable rules engine to enable business-driven personalization logic, allowing dynamic configuration of user segmentation, targeting rules, and real-time decisioning while integrating seamlessly with ML-driven recommendations
Design, build, and maintain robust ML infrastructure and pipelines to support end-to-end workflows including feature extraction, model training, testing, guardrails, model evaluation, deployment, and both real-time and batch inference, ensuring high performance, scalability, and reliability
Architect low-latency, event-driven systems for enabling real-time dynamic personalization and decisioning based on streaming data, user behavior, and contextual signals
Drive the evolution of MLOps practices by building automated metrics-backed deployment workflows, integration validation and testing systems, and scalable monitoring & observability
Invent and introduce state-of-the-art LLM optimization techniques to improve the performance (scalability, cost, latency, throughput) of large-scale production AI systems
Leverage a broad stack of open-source and SaaS AI technologies such as AWS UltraClusters, Hugging Face, vector databases, NeMo Guardrails, PyTorch, and more
Provide organizational technical leadership to influence architecture, engineering standards, and cross-team strategies, mentor engineers, and drive organization-wide platform innovation

Requirements:

Bachelor's degree
At least 10 years of experience designing and building data-intensive solutions using distributed computing
At least 7 years of experience programming in C, C++, Python, or Scala
At least 4 years of experience with the full ML development lifecycle using modern technology in a business-critical setting
8+ years of experience deploying scalable, responsible AI solutions on major cloud platforms (AWS, GCP, Azure)
Master's or PhD in Computer Science or a relevant technical field
5+ years of proven expertise in designing, implementing, and scaling personalization platforms and recommendation systems serving one or more areas of Feed Personalization, Ads Ranking, or Targeted Marketing Messaging
5+ years of strong proficiency in Python, Java, C++, or Golang
Hands-on experience with ML frameworks (PyTorch, TensorFlow) and orchestration tools (Databricks, Airflow, Kubeflow)
5+ years of experience developing and applying state-of-the-art techniques for optimizing training and inference systems to improve hardware utilization, latency, throughput, and cost
5+ years of deep expertise in cloud-native engineering, containerization (Docker, Kubernetes), and automated CI/CD deployment
Passion for staying on top of the latest AI research and AI systems, and for judiciously applying novel techniques in production
Excellent communication and presentation skills, with the ability to articulate complex AI concepts to peers
Proven leadership in driving platform strategy, fostering cross-functional collaboration, and influencing technical direction across the company

Sr. Distinguished Machine Learning Engineer (Remote-Eligible)
