Design, develop, and implement robust data pipelines, integrating data from diverse sources primarily using tools from the AWS ecosystem;
Understand business and software product requirements and translate them into efficient, scalable data products;
Serve as a technical reference for the architectural design of data-driven and AI-first solutions, coordinating across product, analytics, and ML to define standards, best practices, SLAs, observability, and platform governance;
Implement scalable, efficient data enrichment solutions using DBT to transform raw data into tailored data models and actionable business insights;
Manage integrations, transformation processes, and task automation using data connectors, SQL, Python, and REST APIs;
Use SQL for querying and manipulating data in relational and non-relational databases;
Apply Spark to process large volumes of data in both real-time (Kafka and similar) and batch workloads.
Requirements
Experience with AWS services (Athena, S3, EMR, Glue, Lambda);
Proficiency in SQL, DBT, Python, and Spark;
Experience with data orchestration tools such as Apache Airflow;
Knowledge of different data architecture and modeling techniques for analytics (e.g., Data Mesh, Data Mart, Data Lake, Data Warehouse);
Experience with data modeling (relational and/or analytical) and ETL processes;
Experience in software development (APIs, gateways, microservices, and containers) and CI/CD;
Familiarity with software engineering best practices: documentation, automated testing, clean code, and monitoring (Datadog).
Tech Stack
Airflow
AWS
ETL
Kafka
Python
Spark
SQL
Benefits
Profit-sharing
Company car
Food allowance
Meal allowance
Health insurance
Dental insurance
Gympass
Private pension plan
Home office allowance
Allya
Unlimited access to courses from our Localiza University