Role Overview
Advanced Design & Implementation: Designing and implementing robust, scalable, high-performance ETL/ELT data pipelines using PySpark/Scala and Databricks SQL on the Databricks platform.
Delta Lake: Expertise in implementing and optimizing the Medallion architecture (Bronze, Silver, Gold) using Delta Lake to ensure data quality, consistency, and historical tracking.
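For illustration, a minimal Databricks SQL sketch of a Bronze-to-Silver promotion in a Medallion layout (the schema, table, and column names are hypothetical):

```sql
-- Illustrative sketch: promote raw Bronze records into a cleansed Silver table.
-- bronze.raw_orders / silver.orders and their columns are made-up names.
CREATE OR REPLACE TABLE silver.orders AS
SELECT
  CAST(order_id AS BIGINT)    AS order_id,
  CAST(order_ts AS TIMESTAMP) AS order_ts,
  TRIM(customer_email)        AS customer_email
FROM bronze.raw_orders
WHERE order_id IS NOT NULL;   -- basic quality gate on the way to Silver
```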
Lakehouse Platform: Efficient implementation of the Lakehouse architecture on Databricks, combining best practices from DWH and Data Lake environments.
Performance Optimization: Optimizing Databricks clusters, Spark operations, and Delta tables (e.g., Z-Ordering, compaction, query tuning) to reduce latency and compute costs.
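A typical Delta maintenance pass might look like this sketch (table and column names are illustrative):

```sql
-- Compact small files and co-locate commonly filtered columns (hypothetical names).
OPTIMIZE silver.orders
ZORDER BY (customer_id, order_date);

-- The table history records the OPTIMIZE commit and its file-count metrics.
DESCRIBE HISTORY silver.orders;
```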
Streaming: Designing and implementing real-time/near-real-time data processing solutions using Spark Structured Streaming and Delta Live Tables (DLT).
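As one possible shape of such a pipeline, a hedged DLT SQL sketch of a streaming ingestion table (the storage path and table name are made up):

```sql
-- Illustrative DLT streaming table reading incrementally from cloud storage.
CREATE OR REFRESH STREAMING TABLE bronze_events
AS SELECT *
FROM STREAM read_files(
  's3://example-bucket/events/',  -- hypothetical landing location
  format => 'json'
);
```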
Unity Catalog: Implementation and administration of Unity Catalog for centralized data governance, fine-grained security (row- and column-level access control), and data lineage.
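A sketch of what fine-grained Unity Catalog security can look like, assuming hypothetical catalog, table, and group names:

```sql
-- Illustrative grant: the `analysts` group and main.gold.dim_customer are made up.
GRANT SELECT ON TABLE main.gold.dim_customer TO `analysts`;

-- Row-level security: a filter function decides which rows a caller may see.
CREATE OR REPLACE FUNCTION main.gold.region_filter(region STRING)
RETURN IS_ACCOUNT_GROUP_MEMBER('emea_analysts') OR region = 'EMEA';

ALTER TABLE main.gold.dim_customer
SET ROW FILTER main.gold.region_filter ON (region);
```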
Data Quality: Defining and implementing data quality standards and rules (e.g., using DLT or Great Expectations) to maintain data integrity.
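In DLT, such rules are commonly expressed as expectations; a minimal sketch with invented constraint names and thresholds:

```sql
-- Illustrative DLT expectations: bad order_ids are dropped, non-positive
-- amounts are only recorded in the pipeline's quality metrics.
CREATE OR REFRESH STREAMING TABLE silver_orders (
  CONSTRAINT valid_order_id  EXPECT (order_id IS NOT NULL) ON VIOLATION DROP ROW,
  CONSTRAINT positive_amount EXPECT (amount > 0)
)
AS SELECT * FROM STREAM(LIVE.bronze_orders);
```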
Orchestration: Developing and managing complex workflows using Databricks Workflows (Jobs) or external tools (e.g., Azure Data Factory, Airflow) to automate pipelines.
DevOps/CI/CD: Integrating Databricks pipelines into CI/CD processes using tools such as Git, Databricks Repos, and Bundles.
Collaboration: Working closely with Data Scientists, Analysts, and Architects to understand business requirements and deliver optimal technical solutions.
Mentorship: Providing technical guidance to junior developers and promoting best practices.
Requirements
Professional Experience: 5+ years of experience in Data Engineering, including at least 3 years working with Databricks and large-scale Spark.
Databricks Platform: Proven, expert-level experience with the full Databricks ecosystem (Workspace, Cluster Management, Notebooks, Databricks SQL).
Apache Spark: Deep knowledge of Spark architecture (RDD, DataFrames, Spark SQL) and advanced optimization techniques.
Delta Lake: Expertise in implementing and administering Delta Lake (ACID properties, Time Travel, Merge, Optimize, Vacuum).
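These Delta operations can be sketched as follows (table names are hypothetical):

```sql
-- Illustrative upsert of staged changes into a Silver table (ACID MERGE).
MERGE INTO silver.customers AS t
USING staging.customer_updates AS s
ON t.customer_id = s.customer_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;

-- Time Travel: query an earlier table version (version number is made up).
SELECT * FROM silver.customers VERSION AS OF 12;

-- Vacuum files outside the retention window (default threshold is 7 days).
VACUUM silver.customers RETAIN 168 HOURS;
```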
SQL: Advanced/expert skills in SQL and Data Modeling (Dimensional, 3NF, Data Vault).
Cloud: Strong experience with a major Cloud platform (AWS, Azure, or GCP), particularly with storage services (S3, ADLS Gen2, GCS) and networking.
Unity Catalog: Hands-on experience with implementing and administering Unity Catalog.
Lakeflow: Experience with Delta Live Tables (DLT) and Databricks Workflows.
ML/AI Fundamentals: Understanding of basic MLOps concepts and experience with MLflow to support integration with Data Science teams.
DevOps: Experience with Terraform or equivalent tools for Infrastructure as Code (IaC).
Certifications: Databricks certifications (e.g., Databricks Certified Data Engineer Professional) are a strong advantage.
Tech Stack
Airflow
Apache Spark
AWS
Azure
Cloud
ETL
Google Cloud Platform
PySpark
Python
Scala
Spark
SQL
Terraform
Unity Catalog
Vault
Benefits
Premium medical package
Lunch Tickets & Pluxee Card
Bookster subscription
13th salary/yearly bonuses
Enterprise job security with a startup mentality: a diverse and engaging environment, international exposure, and a flat hierarchy, backed by the stability of a secure multinational
A supportive culture (we value ownership, autonomy, and healthy work-life balance) with great colleagues, team events and activities
Flexible working program and openness to remote work
Collaborative mindset – employees shape their own benefits, tools, team events and internal practices
Diverse opportunities in Software Development with international exposure
Flexibility to choose projects aligned with your career path and technical goals
Access to leading learning platforms, courses, and certifications (Pluralsight, Udemy, Microsoft, Google Cloud)
Career growth & learning – mentorship programs, certifications, professional development opportunities, and above-market salary