Skills: Apache Airflow, BigQuery, Cloud Composer, Cloud Functions, Cloud Storage, ELT, ETL, GCP (Google Cloud Platform), Git, Pub/Sub, Python, SQL, Version Control
Role Overview
Design, develop, and maintain ETL/ELT pipelines to ingest, transform, and load large datasets into GCP-based platforms (e.g., BigQuery, Cloud Storage).
Optimise data pipelines for performance, reliability, and scalability.
Develop and manage data models, schemas, and storage solutions aligned with best practices.
Leverage GCP services such as Cloud Composer, Dataflow, Pub/Sub, and Cloud Functions to build automated workflows.
Implement data validation, cleansing, and quality checks to maintain accuracy and integrity.
Collaborate with data scientists, analysts, and business stakeholders to define data requirements and deliver against them.
Set up monitoring systems to track pipeline performance and ensure timely delivery.
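The validation and quality-check responsibilities above can be sketched as a minimal pre-load quality gate in Python. The field names (`user_id`, `amount`, `event_ts`) and the rules themselves are illustrative assumptions, not part of the role description:

```python
from datetime import datetime

def validate_record(record: dict) -> list[str]:
    """Return a list of rule violations for one record (empty list = clean).

    Field names and rules are hypothetical examples of quality checks.
    """
    errors = []
    if not record.get("user_id"):
        errors.append("missing user_id")
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        errors.append("amount must be a non-negative number")
    try:
        datetime.fromisoformat(str(record.get("event_ts")))
    except ValueError:
        errors.append("event_ts is not a valid ISO-8601 timestamp")
    return errors

def split_clean_and_rejects(records):
    """Partition a batch into clean rows and rejects before loading."""
    clean, rejects = [], []
    for rec in records:
        errs = validate_record(rec)
        if errs:
            rejects.append((rec, errs))
        else:
            clean.append(rec)
    return clean, rejects
```

In a real pipeline a gate like this would typically run as a transform step before the load into BigQuery, with rejects routed to a dead-letter table for inspection.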
Requirements
Proficiency in Python for building and deploying data processing scripts.
Strong expertise in GCP services, especially BigQuery, Cloud Storage, Cloud Functions, Cloud Composer, and Pub/Sub.
Experience with SQL for querying and processing data.
Familiarity with workflow orchestration tools like Apache Airflow.
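As an illustration of the SQL querying and processing skills listed above, here is a small aggregation sketched against Python's built-in `sqlite3` as a local stand-in for BigQuery; the `events` table and its columns are hypothetical:

```python
import sqlite3

# Hypothetical raw-events table standing in for a BigQuery source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("u1", 10.0), ("u1", 5.0), ("u2", 7.5)],
)

# Typical ELT-style transform: roll raw events up to per-user totals.
rows = conn.execute(
    """
    SELECT user_id, SUM(amount) AS total
    FROM events
    GROUP BY user_id
    ORDER BY user_id
    """
).fetchall()
print(rows)  # [('u1', 15.0), ('u2', 7.5)]
```

The same `GROUP BY` aggregation would run unchanged (modulo dialect details) as a BigQuery standard SQL query over a warehouse table.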