
Must-Haves:
Google Cloud Platform
Apache Spark
Airflow
SQL
Google Cloud Platform data services
AI/ML experience is a strong nice-to-have
Job description:
KEY RESPONSIBILITIES:
Design and build scalable ETL/ELT pipelines using Apache Airflow, Apache Spark, and Google Cloud Dataflow
Develop and maintain BigQuery data models, schemas, and performance-optimized SQL queries
Build and maintain data pipelines feeding AI/ML feature stores and forecasting models
Collaborate with AI Developers to ensure high-quality, low-latency data access for model training
Manage and optimize Cloud Composer DAGs and pipeline orchestration
Implement data quality monitoring, alerting, and lineage tracking
Participate in data platform architecture decisions and documentation
REQUIRED QUALIFICATIONS:
3+ years (Intermediate) or 5+ years (Specialist) of data engineering experience
Hands-on experience with Apache Airflow for pipeline orchestration
Proficiency in Apache Spark for large-scale data processing
Strong SQL skills including complex query optimization and BigQuery-specific capabilities
Experience with Google Cloud Platform data services: BigQuery, Cloud Storage, Pub/Sub, Dataflow
Solid understanding of ETL/ELT patterns and data warehousing principles
PREFERRED QUALIFICATIONS:
Google Cloud Professional Data Engineer certification
Experience supporting ML/AI data infrastructure (feature engineering, training datasets)
Familiarity with real-time streaming (Kafka, Dataflow/Flink)
Retail or large-scale consumer data experience