Design, build, and maintain scalable, secure, and high-performance data pipelines on GCP.
Develop and manage ETL/ELT workflows using Python, SQL, Cloud Composer, and Dataproc.
Create and optimize enterprise-grade data models and storage solutions using BigQuery.
Partner with Data Science teams to support model development, experimentation, feature engineering, and ML pipeline deployment, leveraging Vertex AI where appropriate.
Set up and maintain CI/CD pipelines using GitHub Actions, ensuring automated testing, versioning, and deployment of data workflows.
Use GKE to orchestrate containerized data applications and services.
Implement best practices in data quality, version control, monitoring, and performance tuning.
Collaborate with cross-functional teams (Product, Engineering, Data Science, Business) to translate data needs into technical solutions.
Ensure data governance, security, and compliance with enterprise and regulatory standards.
Requirements
4–7 years of experience as a Data Engineer or in a similar role, with strong hands-on GCP experience.
Expertise in Python for data processing, automation, and pipeline development.
Strong proficiency in SQL, including building complex queries and optimizing performance.
Hands-on experience with core GCP data engineering services: