Role Overview
Lead the design, development, and maintenance of scalable and efficient data pipelines in AWS.
Design and implement data architectures using AWS services such as S3, Glue, Redshift, EMR, Lambda, and Athena.
Build and maintain robust, automated ETL/ELT pipelines.
Integrate data from multiple sources (APIs, databases, files, etc.), ensuring its quality and traceability.
Collaborate with Data Science, BI, and Product teams to understand data needs and deliver technical solutions.
Optimize performance of data ingestion and transformation processes.
Apply best practices for data security, governance, and regulatory compliance.
Document processes and uphold high development standards.
Requirements
Minimum of 5 years of experience as a Data Engineer, with a strong focus on AWS or Azure.
Proficiency in programming languages such as Python and SQL.
Hands-on experience with Databricks (Apache Spark) for large-scale data processing, ETL/ELT pipelines, and data preparation for analytics and machine learning.
Experience with orchestration tools like Apache Airflow or AWS Step Functions.
Solid understanding of data modeling, data lakes, and data warehouses.
Experience with data governance and metadata management tools, such as Microsoft Purview, including data cataloging, lineage, and data classification.
Experience with Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.
Familiarity with Agile methodologies and DevOps practices.
Strong technical English skills.
Tech Stack
Airflow
Amazon Redshift
Apache Spark
AWS
Azure
ETL
Python
SQL
Terraform
Benefits
100% remote work
Flexible hours
Reduced schedule: 7-hour workdays on Fridays and during the summer.
Individual budget for attending industry events and training.