Build and manage reliable data pipelines covering the ingestion/collection, processing, integration, storage, and provisioning of data across the organization.
Work within a distributed, massively parallel processing (MPP) architecture, combining multiple heterogeneous data sources and collaborating with analytics and data science teams to build solutions and generate data-driven value.
Requirements
Hands-on experience with ingestion, integration, processing, and storage of large volumes of data;
Experience working on Big Data projects;
Behavior Driven Development (BDD);
Data extraction in Python and data processing with PySpark;
Experience with ETL tools;
Knowledge of relational and dimensional data modeling (Data Warehouse);
Experience with SQL databases;
Experience with AWS Big Data-related tools such as EMR, Kinesis, Redshift, S3, Glue, Elasticsearch;
Knowledge of Kafka;
Familiarity with Data Lake and DataOps;
Preferred: AWS certifications;
Experience with infrastructure-as-code tools for cloud provisioning such as Terraform and CloudFormation.
Tech Stack
Amazon Redshift
AWS
Cloud
Elasticsearch
ETL
Kafka
PySpark
Python
SQL
Terraform
Benefits
Flexible Swile card to use as you wish (meal and grocery allowances).