Design, build, and maintain scalable, secure, and high-performance data pipelines on GCP.
Develop and manage ETL/ELT workflows using Python, SQL, Cloud Composer, and Dataproc.
Create and optimize enterprise-grade data models and storage solutions using BigQuery.
Partner with Data Science teams to support model development, experimentation, feature engineering, and ML pipeline deployment, leveraging Vertex AI where appropriate.
Set up and maintain CI/CD pipelines using GitHub Actions, ensuring automated testing, versioning, and deployment of data workflows.
Use GKE to orchestrate containerized data applications and services.
Implement best practices in data quality, version control, monitoring, and performance tuning.
Collaborate with cross-functional teams (Product, Engineering, Data Science, Business) to translate data needs into technical solutions.
Ensure data governance, security, and compliance with enterprise and regulatory standards.
Requirements
4–7 years of experience as a Data Engineer or in a similar role, with strong hands-on GCP experience.
Expertise in Python for data processing, automation, and pipeline development.
Strong proficiency in SQL, including building complex queries and optimizing performance.
Hands-on experience with core GCP data engineering services: