Mphasis provides technology and business solutions. The company is seeking a Senior Data Engineer to architect and manage scalable, secure, cloud-native data platforms, primarily on Google Cloud Platform, while optimizing data pipelines and enabling AI/ML integrations.
Responsibilities:
- Architect and own scalable, secure, cloud-native data platforms on Google Cloud Platform
- Design, build, and optimize batch and real-time data pipelines using BigQuery, Dataflow, Pub/Sub, and Dataproc
- Lead BigQuery performance tuning and cost optimization (partitioning, clustering, query efficiency)
- Orchestrate workflows using Cloud Composer (Apache Airflow)
- Enable AI/ML and GenAI integration via Vertex AI and BigQuery ML
- Enforce data governance, security, reliability, and FinOps best practices
- Mentor engineers, conduct design/code reviews, and set enterprise data engineering standards
- Collaborate with product, analytics, and data science teams to deliver business-critical insights
Requirements:
- 5-8 years of experience
- GCP Data Services: BigQuery, Dataflow (Apache Beam), Pub/Sub, Cloud Storage, Cloud Composer, Dataproc
- Advanced SQL
- Python (Java/Scala a plus)
- ETL/ELT, streaming & batch processing, data modeling, distributed systems
- Lakehouse, Apache Iceberg, Data Mesh concepts
- Vertex AI, BigQuery ML, GenAI-ready pipelines
- Terraform, CI/CD, DataOps practices
- Architecture ownership, mentoring, stakeholder communication, problem solving
- Google Cloud Professional Data Engineer (strongly preferred / often mandatory)
- Looker
- GCP Vertex AI
- Data virtualization (Trino or equivalent)
- Basic knowledge of network connectivity, plus familiarity with Data Catalog, Cloud DLP, BigQuery Data Transfer Service (BQDTS), Storage Transfer Service (STS), and other data transfer methods
- Reporting background (Power BI)
- Apache Iceberg