Creating and managing data pipeline architecture for data ingestion, pipeline setup, and data curation based on AWS;
Working with and creating cloud data solutions.
Assembling large, complex data sets that meet functional/non-functional business requirements
Implementing the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using PySpark, Hive, Iceberg, SQL, and AWS big data technologies
Building analytics tools that use the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics
Manipulating data at scale: getting data into a ready-to-use state in close alignment with business units and stakeholders
Requirements
Bachelor's degree in computer science, software engineering, or comparable work experience
At least 5 years of experience as a data engineer or software developer, preferably in the automotive sector.
Strong knowledge of Data Lake, Data Warehouse, and RDS architectures
Experience with Python and SQL (any other OOP language is also valuable), preferably with PySpark or Spark knowledge
Proven experience in ETL
Strong knowledge of AWS services, namely S3, Athena, Lambda, Glue, IAM, SQS, etc.
Strong communication and analytical skills
Great English skills, both written and spoken.
Nice to have
Experience with AWS CDK (Cloud Development Kit)
Experience with CI/CD pipelines
Tech Stack
AWS
Cloud
ETL
PySpark
Python
Spark
SQL
Benefits
1200€ to equip your home office
Flex allowance: 720€/year (60€/month x 12)
Performance bonus
Health insurance extendable to your household
Wellbeing program extendable to your household
Life insurance
Pension fund
Smartphone & phone data plan
Discounts on VW Group cars and special financing conditions