Walmart Canada, a leading retail corporation, is seeking a Senior Data Scientist (Machine Learning Engineer) to join its Catalog Data Science team. The role involves developing and deploying machine learning models that improve data quality and maintain customer trust in the Walmart marketplace, with a focus on compliance detection and policy violation classification.
Responsibilities:
- Design and deploy production-grade ML systems for Walmart's Catalog Trust & Safety platform — spanning classification, detection, and segmentation
- Apply GenAI, NLP, and Computer Vision techniques to build and continuously improve models for compliance detection, content moderation, and policy violation classification
- Own the full model lifecycle — from experimentation and offline evaluation through serving, monitoring, and iterative improvement in production
- Build and optimize high-throughput batch and real-time inference pipelines using frameworks like Ray, Triton, and vLLM, with a focus on latency, cost, and reliability
- Drive ML architecture decisions — including model selection, distillation, quantization, and serving strategies
- Partner with Compliance, Product, and Operations teams to translate business requirements into model KPIs, evaluation frameworks, and measurable impact
- Establish and enforce ML engineering best practices across the team: reproducible training, robust evaluation datasets, versioned artifacts, and production readiness standards
- Contribute to the broader ML engineering community at Walmart through technical documentation, internal talks, and cross-team knowledge sharing
Requirements:
- PhD or Master's in Computer Science, or equivalent experience; 3+ years building and deploying production ML systems at scale
- Deep expertise in model serving and inference optimization — experience with Triton Inference Server, vLLM, TorchServe, or comparable frameworks
- Hands-on experience with Generative AI technologies: LLMs, multimodal models, RAG architectures, prompt engineering, and fine-tuning (LoRA/QLoRA, PEFT)
- Strong foundation in classical ML, deep learning, and modern architectures — CNNs, Transformers, and domain-specific variants
- Proven ability to build and operate large-scale batch and real-time inference pipelines handling high QPS with strict latency and throughput SLAs
- Proficiency in Python and ML ecosystem tooling — PyTorch, HuggingFace, scikit-learn, NumPy; familiarity with distributed compute frameworks (Ray, Spark)
- Experience deploying and managing ML workloads on Kubernetes; solid working knowledge of Docker, Helm, and container orchestration
- Familiarity with ML observability — model monitoring, data drift detection, performance degradation alerting, and online evaluation strategies
- Practical experience with MLOps tooling: experiment tracking (MLflow, W&B), pipeline orchestration (Airflow, Kubeflow), and CI/CD for ML
- Hands-on experience with at least one major cloud platform (GCP, Azure, etc.) and comfort with managed ML services and GPU infrastructure
- Working knowledge of relational and NoSQL databases
- Experience with vector databases (Pinecone, Weaviate, pgvector) and hybrid retrieval systems for GenAI applications
- Experience with version control systems, especially Git
- Strong verbal and written communication skills; ability to translate complex ML systems into clear technical and business narratives
- Proactive in tracking the latest AI/ML research and translating advancements into production-grade solutions
- Option 1: Bachelor's degree in Statistics, Economics, Analytics, Mathematics, Computer Science, Information Technology, or a related field and 3 years' experience in an analytics-related field
- Option 2: Master's degree in Statistics, Economics, Analytics, Mathematics, Computer Science, Information Technology, or a related field and 1 year's experience in an analytics-related field
- Option 3: 5 years' experience in an analytics or related field
- Experience with data science, machine learning, and optimization models
- Master's degree in Machine Learning, Computer Science, Information Technology, Operations Research, Statistics, Applied Mathematics, Econometrics
- Successful completion of one or more assessments in Python, Spark, Scala, or R
- Experience using open-source frameworks (for example, scikit-learn, TensorFlow, PyTorch)
- We value candidates with a background in creating inclusive digital experiences who demonstrate knowledge of Web Content Accessibility Guidelines (WCAG) 2.2 AA standards and assistive technologies, and of integrating digital accessibility seamlessly
- The ideal candidate would have knowledge of accessibility best practices and join us as we continue to create accessible products and services following Walmart's accessibility standards and guidelines for supporting an inclusive culture