Design and implement scalable, secure data platforms on Google Cloud using managed services (BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Storage, Composer)
Build reusable frameworks and tooling (ingestion, transformation, quality, orchestration) that can be adopted by multiple product and domain teams
Enable self‑service data consumption and governance by standardizing patterns, templates, and platform capabilities rather than one‑off pipelines
Design logical and physical data platform architectures leveraging BigQuery, Dataflow/Apache Beam, Dataproc/Spark, Pub/Sub, and Cloud Storage
Define and implement standardized ingestion, transformation, and serving patterns (batch and streaming) as reusable blueprints
Optimize cost, performance, and reliability of GCP data workloads (partitioning, clustering, storage classes, autoscaling strategies)
Build opinionated data ingestion frameworks (e.g., config‑driven pipelines, connectors, schema handling, error handling) on top of Dataflow, Dataproc, or Composer
Develop shared transformation libraries in Python/SQL/Beam (e.g., common SCD patterns, data quality checks, masking/tokenization routines)
Provide orchestration capabilities via Cloud Composer or Cloud Workflows with reusable DAGs/templates and CI/CD integration
Implement robust data modeling (dimensional, data vault, or canonical models) and semantic layers in BigQuery and related tools
Enforce data quality, lineage, and observability using standardized metrics, validation rules, and monitoring dashboards
Apply security and governance controls: IAM, VPC‑SC, CMEK, row/column‑level security, and policy‑driven access patterns
Partner with domain data engineers, analytics, and ML teams to onboard use cases onto platform services and frameworks
Document patterns, runbooks, and best practices, and provide enablement through workshops and code examples
Contribute to the platform roadmap, tool selection, and evaluation of new GCP services and open‑source components
Requirements
5+ years of Database Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
5+ years of data management experience in a public cloud (GCP, AWS, or Azure)
5+ years of hands-on experience with Python or Java, plus Spark SQL, for building data pipelines, libraries, and automation tooling
5+ years of experience with orchestration tools (Cloud Composer/Airflow) and CI/CD (Cloud Build, Git‑based workflows) for data workloads
Tech Stack
Airflow
Apache Beam
AWS
Azure
BigQuery
Cloud
Google Cloud Platform
Java
Python
Spark
SQL
Data Vault
Benefits
Health benefits
401(k) Plan
Paid time off
Disability benefits
Life insurance, critical illness insurance, and accident insurance