Build & Operate Large-Scale Feature Pipelines: Design and maintain batch/streaming pipelines (Spark, Flink, Databricks, Airflow) producing ML features for ranking models.
Ensure Point-in-Time Correctness: Build feature sets that use only data available as of each prediction's timestamp, preventing label leakage in offline training and keeping online inference consistent with training.
Develop Embedding & Content Pipelines: Build scalable workflows for metadata, imagery, and multimodal representations; partner with Science teams to operationalize new models.
Architect Data Foundations: Design Delta/Parquet data models and medallion layers, optimizing storage layout and partitioning for latency and cost.
Real-Time Engineering: Build Kafka-based systems for real-time features and user-activity aggregations, ensuring robust handling of out-of-order events and exactly-once semantics.
Governance & Leadership: Define data quality rules and schema evolution processes while collaborating across ML pods to translate model needs into infrastructure.
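The point-in-time correctness responsibility above is, at its core, a leakage-free temporal join: every training row may only see feature values that existed at its event timestamp. A minimal sketch using pandas `merge_asof` as a small-scale stand-in for the Spark equivalent (all table and column names here are hypothetical):

```python
import pandas as pd

# Hypothetical event log: timestamps at which predictions were made.
events = pd.DataFrame({
    "user_id": [1, 1, 2],
    "event_ts": pd.to_datetime(["2024-01-05", "2024-01-20", "2024-01-10"]),
})

# Hypothetical feature snapshots: each row is valid from feature_ts onward.
features = pd.DataFrame({
    "user_id": [1, 1, 2],
    "feature_ts": pd.to_datetime(["2024-01-01", "2024-01-15", "2024-01-01"]),
    "clicks_30d": [10, 42, 7],
})

# Point-in-time join: for each event, take the latest feature row whose
# timestamp is <= the event timestamp -- never a future value (no leakage).
events = events.sort_values("event_ts")
features = features.sort_values("feature_ts")
training_set = pd.merge_asof(
    events, features,
    left_on="event_ts", right_on="feature_ts",
    by="user_id", direction="backward",
)
print(training_set[["user_id", "event_ts", "clicks_30d"]])
```

A naive equality join on `user_id` alone would attach the most recent snapshot to every event, silently leaking future behavior into training; `direction="backward"` is what enforces the as-of semantics.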
Requirements
7+ years of experience in large-scale data or software engineering.
Deep experience with Spark (PySpark/Scala), Databricks, Airflow, and Kafka.
Proficiency in building feature pipelines, implementing temporal (point-in-time) joins, and mitigating training-serving skew.
Experience with AWS/Azure/GCP and high-performance query engines such as Snowflake or Redshift.
Strong programming skills in Python and SQL, with a focus on performance optimization.
Experience in personalization domains (search, ranking, or recommender systems).
Experience supporting petabyte-scale data lakehouses or feature stores.
Familiarity with GenAI/RAG systems, multimodal content, or Delta Live Tables.
Knowledge of causal inference, experimentation signals, or ML evaluation workflows.
Experience with Terraform for governed, repeatable deployments.
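The real-time requirements above (and the Kafka responsibility earlier) hinge on handling out-of-order events. A toy watermark-style tumbling-window aggregator in plain Python, with the Kafka/Flink mechanics abstracted away; `WINDOW` and `ALLOWED_LATENESS` are hypothetical parameters, not any library's API:

```python
from collections import defaultdict

WINDOW = 10           # tumbling-window size (hypothetical units, e.g. seconds)
ALLOWED_LATENESS = 5  # how far out-of-order an event may arrive and still count

def window_counts(events):
    """Count (timestamp, key) events per tumbling window, tolerating
    out-of-order arrivals up to ALLOWED_LATENESS behind the max timestamp.
    Returns ({(window_start, key): count}, number_of_dropped_events)."""
    open_windows = defaultdict(int)
    closed = {}
    watermark = float("-inf")
    dropped = 0
    for ts, key in events:
        # Watermark trails the highest timestamp seen by the lateness budget.
        watermark = max(watermark, ts - ALLOWED_LATENESS)
        if ts < watermark:
            dropped += 1  # too late: its window may already have been emitted
            continue
        open_windows[(ts // WINDOW * WINDOW, key)] += 1
        # Emit (close) windows that end at or before the watermark.
        for w in [w for w in open_windows if w[0] + WINDOW <= watermark]:
            closed[w] = open_windows.pop(w)
    closed.update(open_windows)  # end of stream: flush remaining windows
    return closed, dropped

# Events arrive out of order; (2, "a") and (4, "a") fall behind the watermark.
result, late = window_counts(
    [(1, "a"), (3, "a"), (12, "a"), (2, "a"), (20, "a"), (4, "a")]
)
```

Production systems (Flink, Spark Structured Streaming) implement the same idea with per-partition watermarks and state stores; the trade-off shown here, a larger lateness budget versus fresher emitted aggregates, is the one that matters for user-activity features.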