Develop, maintain, and optimize data pipelines, ensuring efficiency, scalability, and reliability when processing large volumes of data.
Contribute to the design and evolution of the data architecture, applying best practices for modeling, ingestion, and processing to keep data environments well organized.
Implement and promote data governance best practices, including standardization, data quality, cataloging, and access control.
Collaborate with stakeholders across different areas, understanding business needs and translating them into data solutions.
Ensure data quality and integrity throughout the entire pipeline (ingestion, transformation, and consumption).
Monitor data pipelines and data flows, identifying and resolving issues to ensure high availability.
Support continuous improvement initiatives for the data platform, focusing on performance, cost reduction, and governance.
Requirements
Completed bachelor's degree in Systems Analysis and Development, Software Engineering, Business Administration, or a related field.
Minimum of 2 years of proven experience working with data.
Python: Development and automation of data pipelines.
Databricks: Analytics and big data processing platform.
SQL/NoSQL: Manipulation and querying of relational and non-relational databases.
Desired
Data Architecture: Design and implementation of scalable solutions.
AWS: Cloud services for data architecture.
Apache Airflow: Orchestration and scheduling of workflows.
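To illustrate the kind of work the Python and Apache Airflow items above refer to, here is a minimal sketch of a daily two-step pipeline. The DAG name, task names, and schedule are hypothetical, and the `schedule` argument assumes Airflow 2.4 or later; this is an illustration of the skill set, not a prescribed implementation.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest_orders() -> None:
    """Placeholder ingestion step; a real task would pull from a source system."""
    print("ingesting orders...")


def transform_orders() -> None:
    """Placeholder transformation step, e.g. cleaning and standardizing records."""
    print("transforming orders...")


with DAG(
    dag_id="orders_daily",         # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",             # `schedule` requires Airflow 2.4+
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest_orders", python_callable=ingest_orders)
    transform = PythonOperator(task_id="transform_orders", python_callable=transform_orders)
    ingest >> transform  # run the transformation only after ingestion succeeds
```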