Develop and scale data ingestion pipelines, ensuring lineage and documentation for traceability;
Implement and maintain automated data tests (e.g., schema validation, business rules, consistency and completeness) throughout pipelines;
Optimize ETL processes with a focus on latency and cost efficiency;
Define and monitor data SLAs/SLOs, ensuring availability, freshness and reliability of information;
Implement data observability practices, including monitoring of volume, distribution, anomalies and pipeline failures;
Participate in data incident analysis, identifying root causes and implementing preventive actions;
Contribute to data documentation, cataloging and lineage to facilitate traceability and trust in the data.
Requirements
Programming experience focused on data engineering, including manipulation of complex data structures, exception handling and use of Python libraries such as Pandas, PySpark or Boto3;
Strong experience with Apache Airflow, including the ability to develop, version and troubleshoot complex DAGs while applying coding best practices;
Practical experience with the AWS ecosystem, with a focus on:
Processing and Storage: S3, EC2 and AWS Glue;
Migration and Ingestion: proficiency in AWS DMS (Database Migration Service) for data replication;
Proficiency with relational databases (MySQL and PostgreSQL) and experience with non-relational databases (MongoDB), understanding their use cases, data modeling and optimizations;
Knowledge of MLflow for experiment tracking and model lifecycle management;
Experience writing advanced queries and optimizing performance for large volumes of data.
Tech Stack
Airflow
Apache
AWS
EC2
ETL
MongoDB
MySQL
Pandas
Postgres
PySpark
Benefits
Health insurance;
Dental insurance;
Meal and grocery allowance;
Childcare assistance;
Life insurance;
Profit-sharing program (PPR);
Day off during your birthday month;
Wellhub;
Férias&Co (travel benefit);
6 months of maternity leave and 20 days of paternity leave;
Flexible working hours;
Partnerships with establishments and institutions in areas such as education, health, leisure and entertainment.