Lead the design and implementation of robust pipelines for large-scale data ingestion and processing, both batch and streaming, ensuring low latency, high availability, and resilience.
Define and maintain data architecture standards, including partitioning, versioning, security, and governance, ensuring consistency across domains and consumer teams.
Ensure data reliability, quality, and traceability through observability, monitoring, automated testing, and Data Quality practices across the entire data lifecycle.
Act as the technical point of contact for product and business stakeholders, translating strategic needs into scalable solutions and clearly communicating trade-offs (time, cost, performance, and risk).
Evaluate and recommend technologies and frameworks for the data stack, ensuring continuous evolution with a focus on operational efficiency and maintainability.
Diagnose and optimize the performance and cost of queries, jobs, storage, and orchestration.
Support the technical development of the team through code reviews, mentoring, and knowledge sharing, raising standards and accelerating the adoption of engineering best practices.
Collaborate with other disciplines (Analytics Engineers, Data Analysts, Software Engineers, etc.) to integrate end-to-end solutions and unblock dependencies between teams.
Requirements
Bachelor's degree in a quantitative field (Computer Science, Engineering, Mathematics, etc.).
Strong experience in Data Engineering, with autonomy to lead projects and make high-impact technical decisions.
Advanced knowledge of Microsoft Azure services and features.
Experience with parallel processing and messaging technologies (Hadoop, Spark, Kafka, etc.).
Proficiency in data ingestion, integration, and orchestration tools (Azure Data Factory, Airflow, etc.).
Experience with large-scale file storage environments (HDFS, blob storage, Data Lake, etc.).
Strong mastery of relational databases (queries, programming, modeling, and performance).
Development experience in Python (including PySpark) and/or Scala.
Experience building and maintaining REST APIs and cloud functions (Azure).
Ability to communicate technical decisions clearly and influence stakeholder alignment.
The following are a plus:
Azure certification (Microsoft Certified: Azure Data Engineer Associate).
Databricks certification (Data Engineer Associate or Professional).
Knowledge of Data Mesh architecture focused on data domains and federation between platform and product teams.
Experience with NoSQL databases (Cassandra, MongoDB, etc.).
Knowledge of user event capture and tracking solutions (Mixpanel, Segment, etc.).
Experience with data contracts, quality standards, and CI/CD automation for data pipelines.
Experience with lakehouse architectures and open storage formats (Delta Lake, Apache Iceberg, etc.).
Tech Stack
Airflow
Apache
Azure
Cassandra
Cloud
Hadoop
HDFS
Kafka
MongoDB
NoSQL
Python
Scala
Spark
Benefits
🕘 Flexible Hours: Greater autonomy to organize your schedule with balance and responsibility.
🏠 Flexible Work Arrangements: Remote, hybrid, or on-site, depending on role requirements.
👕 No Dress Code: Freedom to be yourself, without formalities.
🎉 Birthday Day Off: One day off in your birthday month to celebrate as you wish.
🔋 Blip Recharge: 5 paid days off per year for roles without time tracking, designed to help balance work and rest.
🍽️ Meal Allowance or Food Voucher: R$ 1,144.00 per month, with no deductions, also credited during vacations and leaves.
🚍 Transportation Voucher: Available as needed for commuting.
🏋️ Wellhub (Gympass): Access to gyms, wellness apps, and fitness activities, also available for dependents.
🎭 SESC Partnership: Access to culture, leisure, sports, hotels, holiday camps, and more.
🩺 Health Insurance (SulAmérica): National coverage with a private room, for you and your dependents, with only co-payments applied.
🦷 Dental Plan: National coverage for you and your dependents, with three plan options and full company coverage of the chosen plan's cost.
🏳️‍🌈 Your Name Matters: Reimbursement of up to R$ 250.00 for expenses related to first name and/or gender marker changes, supporting inclusion and respect for identity.