As part of the Data Engineering team, you will be responsible for the design, development, and operations of large-scale data systems operating at petabyte scale.
You will focus on real-time data pipelines, streaming analytics, distributed big data, and machine learning infrastructure.
You will interact with engineers, product managers, BI developers, and architects to provide scalable, robust technical solutions.
Requirements
Solid experience in data engineering, handling large volumes of data, ETL, data modeling, and tools such as Spark, Python, SQL, etc.
A minimum of 6-8 years of big data development experience.
Up-to-date expertise in data engineering and complex data pipeline development.
Experience with agile development models.
Design, develop, implement, and tune large-scale distributed systems and pipelines that process large volumes of data, focusing on scalability, low latency, and fault tolerance in every system built.
Experience writing data pipelines and data processing layers in Java and Python.
Experience with Airflow and GitHub.
Conversational English.
Experience writing MapReduce jobs.
Expertise in writing complex, highly optimized queries across large data sets.
Proven working expertise with big data technologies: Hadoop, Hive, Kafka, Presto, Spark, HBase.
Highly proficient in SQL.
Experience with cloud technologies (GCP, Azure).
Experience with relational and in-memory data stores desirable (Oracle, Cassandra, Druid).
Provide and support the implementation and operation of data pipelines and analytical solutions.
Experience performance-tuning systems that work with large data sets.
Experience with REST API data services for data consumption.
Retail experience is a huge plus.
Tech Stack
Airflow
Azure
Cassandra
Cloud
Distributed Systems
ETL
Google Cloud Platform
Hadoop
HBase
Java
Kafka
Oracle
Python
Spark
SQL
Benefits
Salary commensurate with experience
Legally mandated benefits
Extra paid days off
Discount agreements with language schools, entertainment, courses, and more