TechWish is seeking a GCP Data Engineer with strong MLOps experience to build, scale, and operationalize data and ML pipelines on Google Cloud. The role involves partnering with Data Science, Product, and Platform teams to deliver reliable machine learning workflows, maintain operational excellence, and monitor model performance.
Responsibilities:
- Design, develop, and optimize scalable data pipelines and ML workflows on GCP with BigQuery and Spark
- Build robust ELT/ETL processes and data models supporting ML feature stores, training datasets, and production inference
- Orchestrate pipelines and jobs (e.g., with Airflow), including dependency management, retries, and observability
- Implement CI/CD and automation for data/ML pipelines, including packaging, versioning, and environment promotion
- Develop event-driven and micro-batch processes for real-time ML inference (e.g., via Cloud Functions) and low-latency data preparation
- Establish model performance monitoring, drift detection, data quality checks, alerting, and dashboards
- Collaborate closely with Data Scientists to productionize models and establish reproducible training/inference workflows
- Enforce best practices for code quality, testing, documentation, and cost/performance optimization on GCP
- Troubleshoot production issues, drive root-cause analysis, implement durable fixes, and document postmortems
Requirements:
- Hands-on experience with Google Cloud (BigQuery) in production environments
- Strong Spark expertise (data processing, optimization, and job orchestration)
- Advanced proficiency in Python and SQL for data engineering and ML pipeline development
- Demonstrated experience building and supporting production-grade data/ML pipelines
- Experience with Airflow, the gcloud CLI, and Cloud Functions on GCP
- Solid understanding of core ML concepts (training, evaluation, deployment patterns)
- Experience with ML model performance monitoring (data/feature drift, model decay, alerting, dashboards)
- Familiarity with Explainable AI (XAI) and LLM concepts (prompting, evaluation, guardrails)
- Familiarity with real-time machine learning patterns (feature serving, low-latency inference, event-driven architectures)
- Experience with packaging, testing, and CI/CD for ML (artifact/version management, reproducibility)