As a Data Engineer, you will help clients meet their data engineering needs.
You’ll also make a real impact by taking an active role in the team’s Agile development practices, technical decision-making and development, generating value and continuously improving the quality and reliability of our data and processes.
You will spend the majority of your time delivering on client data engineering opportunities and supporting the growth of our junior to mid-level technologists, both within AND and for our clients.
Stack Migration: Migrating the existing data processing stack (PostgreSQL/Jenkins/Python) to a modern cloud-based stack (AWS/Airflow/Redshift/Python/PySpark/Spark), ensuring continuity and data integrity throughout the transition.
Design and Build Scalable ETL Pipelines: Build and optimise ETL/ELT data pipelines, focusing on performance, scalability, and data integrity.
Maintain and Develop Data Pipelines: Support and maintain existing pipelines while building new ones to meet evolving business needs and stakeholder requirements.
Streaming & Integration: Develop and maintain both batch and real-time data processing pipelines from multiple data sources.
Support DevOps Practices: Contribute to CI/CD processes for testing, building, and deploying data pipelines.
Performance Optimisation: Tune pipeline performance for both batch and real-time data processing.
Collaboration and Mentorship: Partner with internal teams and engineers to ensure seamless data integration.
Requirements
Complete fluency in Dutch is required for this role to communicate with our Dutch stakeholders.
Strong expertise in Python and PySpark/Spark for building scalable data pipelines
Strong knowledge of SQL for querying structured data and interacting with databases, including PostgreSQL
Strong understanding of data modelling, ETL/ELT processes and data governance
Experience with data warehousing concepts and delivering analytical insights to internal stakeholders
Deep expertise in at least one major cloud platform (AWS, Azure, or GCP)
Experience with Apache Airflow for workflow orchestration and pipeline scheduling
Hands-on experience with AWS services, particularly Glue and Redshift
Experience with Apache Kafka or similar streaming technologies
Experience with Delta Lake/Delta Table, Lakehouse or Medallion Architecture (Bronze/Silver/Gold)
Deep expertise in modern data platforms such as Databricks or Snowflake
Experience with data migration, including familiarity with tools such as AWS DMS
Consulting or client-facing experience, translating business requirements into data solutions
Familiarity with Data Vault 2.0 and Star Schema design
Infrastructure as Code experience (Terraform or AWS CloudFormation)
Familiarity with enterprise tooling such as Informatica (ETL, Data Governance, Data Quality) or Data Mesh enablers