Design, develop and implement robust data pipelines that connect data from multiple sources, primarily using tools from the AWS ecosystem;
Understand business requirements and those of various software products and translate them into efficient, scalable data products;
Serve as a technical reference for the architectural design of Data-Driven and AI-First solutions, coordinating between product, analytics and ML to define standards, best practices, SLAs, observability and platform governance;
Implement scalable and efficient data enrichment solutions using DBT to transform raw data into tailored data models and actionable business insights;
Manage integrations, transformation processes and task automations using data connectors, SQL APIs, Python and REST;
Use SQL to query and manipulate data in relational and non-relational databases;
Use Spark to process large volumes of data, both in real time (with Kafka and similar tools) and in batch.
Requirements
Experience with AWS (Athena, S3, EMR, Glue, Lambda);
Proficiency in SQL, DBT, Python and Spark;
Experience with data orchestration tools such as Apache Airflow;
Knowledge of different data architecture and modeling techniques for analytics purposes (e.g., Data Mesh, Data Mart, Data Lake, Data Warehouse);
Experience with Data Modeling (relational and/or analytical) and ETL processes;
Experience in Software Development (APIs, gateways, microservices and containers) and CI/CD;
Understanding of software engineering best practices: documentation, automated testing, clean code and monitoring (Datadog).
Tech Stack
Airflow
Apache
AWS
ETL
Kafka
Python
Spark
SQL
Benefits
Profit sharing
Company car
Food allowance
Meal voucher
Health insurance
Dental insurance
Gympass
Private pension plan
Home office allowance
Allya
Unlimited access to various courses from our Localiza University