Role Overview
- Build and maintain data pipelines for ingestion, transformation, and export across multiple sources and destinations
- Develop and evolve scalable data architecture to meet business and performance requirements
- Partner with analysts and data scientists to deliver curated, analysis-ready datasets and enable self-service analytics
- Implement best practices for data quality, testing, monitoring, lineage, and reliability
- Optimize workflows for performance, cost, and scalability, e.g., tuning Spark jobs, query optimization, and partitioning strategies (see the sketch after this list)
- Ensure secure data handling and compliance with relevant data protection standards and internal policies
- Contribute to documentation, standards, and continuous improvement of the data platform and engineering processes
- Apply governance practices to ML assets, including access controls, auditability, and compliant handling of models and training data
- Build and maintain MLOps automation: CI/CD for ML, environment management, artifact handling, versioning of data/models/code
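For a concrete flavor of the optimization work above, here is a minimal PySpark sketch of repartitioning around a join and writing date-partitioned output; the paths, tables, and column names are purely hypothetical, not part of our actual stack:

```python
# Minimal PySpark sketch: repartition on the join key and write
# date-partitioned output. All names and paths are illustrative only.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("partitioning-example")
    # Match shuffle parallelism to the cluster instead of the 200 default
    .config("spark.sql.shuffle.partitions", "64")
    .getOrCreate()
)

events = spark.read.parquet("s3://bucket/events/")  # hypothetical path
users = spark.read.parquet("s3://bucket/users/")    # hypothetical path

# Repartition on the join key so matching rows co-locate,
# and broadcast the small side to avoid a full shuffle join
joined = (
    events.repartition(64, "user_id")
    .join(users.hint("broadcast"), "user_id")
)

# Partitioning output by event date lets downstream queries prune files
joined.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://bucket/curated/events_enriched/"
)
```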
Requirements
- Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience)
- 6+ years of experience as a Data Engineer, building and maintaining production-grade pipelines and datasets
- Strong Python and SQL skills with a solid understanding of data structures, performance, and optimization strategies
- Hands-on experience with orchestration tools (e.g., Airflow, Dagster, Databricks Workflows) and distributed processing in a cloud environment
- Experience with analytical data modeling (star and snowflake schemas), data warehousing, ETL/ELT patterns, and dimensional modeling concepts
- Experience building reliable incremental data ingestion pipelines from databases and APIs (see the sketch after this list)
- Familiarity with at least one major cloud provider (GCP, AWS, Azure) and deploying data solutions in the cloud
- Familiarity with CI/CD for data pipelines, infrastructure as code (e.g., Terraform), and DataOps practices
- Strong troubleshooting mindset: able to debug issues across data, infrastructure, pipelines, and deployments
- Collaborative mindset and clear communication across engineering, analytics, and business stakeholders
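To illustrate the orchestration and incremental-ingestion experience described above, here is a minimal Airflow 2.x-style sketch of a watermark-based incremental load; the DAG id, source table, and schedule are hypothetical:

```python
# Minimal Airflow sketch of a watermark-based incremental load.
# DAG id, table names, and schedule are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def load_increment(**context):
    """Pull only rows newer than the last processed watermark."""
    # In a real pipeline the watermark would live in a metadata/state
    # store; here the scheduled data interval stands in for it.
    start = context["data_interval_start"]
    end = context["data_interval_end"]
    # A production version would bind parameters instead of f-strings
    query = (
        "SELECT * FROM source.orders "
        f"WHERE updated_at >= '{start}' AND updated_at < '{end}'"
    )
    # ... execute against the source and load results into the warehouse
    print(query)


with DAG(
    dag_id="orders_incremental",
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    PythonOperator(task_id="load_increment", python_callable=load_increment)
```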
Nice to Have / Big Advantage
- Strong GCP experience and ecosystem knowledge: BigQuery, Composer, Dataproc, Cloud Run, Dataplex, Cloud Storage
- Experience with data governance concepts: access control, retention, data classification, auditability, and compliance standards
- Experience building observability for data systems (metrics, alerting, data quality checks, incident response)
- Knowledge of model monitoring concepts: drift, data quality issues, performance degradation, bias checks, and alerting strategies (see the sketch after this list)
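As an illustration of the drift monitoring mentioned above, here is a minimal sketch of one common drift metric, the population stability index (PSI); the data, bucketing, and alert threshold are illustrative assumptions, not a prescribed method:

```python
# Minimal sketch of a population-stability-index (PSI) drift check.
# Thresholds and sample data are illustrative only.
import numpy as np


def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """Compare two samples of one feature; higher PSI means more drift."""
    # Bin edges come from the reference (training-time) distribution
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    # Clip both samples into the reference range so every value is binned
    e_frac = np.histogram(np.clip(expected, edges[0], edges[-1]), edges)[0] / len(expected)
    a_frac = np.histogram(np.clip(actual, edges[0], edges[-1]), edges)[0] / len(actual)
    # Floor empty buckets to avoid log(0)
    e_frac = np.clip(e_frac, 1e-6, None)
    a_frac = np.clip(a_frac, 1e-6, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))


rng = np.random.default_rng(0)
baseline = rng.normal(0, 1, 10_000)  # training-time distribution
live = rng.normal(0.3, 1, 10_000)    # shifted production sample
score = psi(baseline, live)
# A common rule of thumb: PSI > 0.2 warrants an alert
print(f"PSI = {score:.3f}", "-> drift" if score > 0.2 else "-> ok")
```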
Tech Stack
- Airflow
- AWS
- Azure
- BigQuery
- ETL
- Google Cloud Platform
- Python
- Spark
- SQL
- Terraform
Benefits
- Excellent compensation package
- myPOS Academy for upskilling and training
- Unlimited access to courses on LinkedIn Learning
- Refer-a-friend bonus, because working with friends is fun
- Team building, social activities, and networking at a multinational level