Role Overview
Design, implement, and maintain robust and scalable data pipelines using AWS, Azure, and containerization technologies.
Develop and maintain ETL/ELT processes to extract, transform, and load data from various sources into data warehouses and data lakes.
Collaborate with data scientists, analysts, and other engineers to ensure seamless data flow and availability across the organization.
Optimize data storage and retrieval performance by utilizing cloud services such as Amazon Redshift, Azure Synapse, or other relevant technologies.
Work with containerization tools like Docker and Kubernetes to ensure smooth deployment, scalability, and management of data pipelines.
Monitor, troubleshoot, and optimize data processing pipelines for performance, reliability, and cost-efficiency.
Automate manual data processing tasks and improve data quality by implementing data validation and monitoring systems.
Implement and maintain CI/CD pipelines for data workflow automation and deployment.
Ensure compliance with data governance, security, and privacy regulations across all data systems.
Participate in code reviews and ensure the use of best practices and documentation for data engineering solutions.
Stay up-to-date with the latest data engineering trends, cloud services, and technologies to continuously improve system performance and capabilities.
Requirements
Excellent communication skills are essential, including fluent spoken English, to explain complex technical concepts to non-technical stakeholders and collaborate across teams.
Proven experience as a Data Engineer, with hands-on experience building and managing data pipelines.
Strong proficiency in cloud technologies, specifically AWS (e.g., S3, Redshift, Glue) and Azure (e.g., Data Lake, Azure Synapse).
Experience working with containerization and orchestration tools such as Docker and Kubernetes.
Proficiency in programming languages commonly used in data engineering, such as Python, Java, or Scala.
Solid experience with SQL and NoSQL databases (e.g., PostgreSQL, MongoDB, Cassandra).
Familiarity with data processing frameworks like Apache Spark, Apache Kafka, or similar tools.
Experience with workflow orchestration tools like Apache Airflow, dbt, or similar.
Knowledge of data warehousing concepts and technologies (e.g., Snowflake, Amazon Redshift, or Google BigQuery).
Strong understanding of ETL/ELT processes and best practices.
Experience with version control systems like Git.
Strong problem-solving skills and a proactive approach to troubleshooting and optimization.
Tech Stack
Airflow
Amazon Redshift
Apache
AWS
Azure
BigQuery
Cassandra
Cloud
Docker
ETL
Java
Kafka
Kubernetes
MongoDB
NoSQL
Postgres
Python
Scala
Spark
SQL
Benefits
Competitive salary and flexible payment options.
Opportunities for growth and professional development.
Flexible working hours and full remote work opportunity.
Work in a collaborative, innovative, and inclusive environment.
Be a part of a data-driven culture that is at the forefront of innovation.