Medasource is seeking a highly skilled Data Engineer to design, build, and optimize modern data pipelines within a cloud-first architecture. The role focuses on Azure Databricks and CI/CD best practices and plays a critical part in developing scalable data solutions that support analytics and advanced data initiatives, including AI/ML.

Responsibilities:

Design, build, and maintain scalable data pipelines using Azure Databricks
Develop and optimize ETL/ELT workflows using Apache Airflow
Implement CI/CD pipelines and version control strategies using GitHub
Collaborate with Data Architects, Analysts, and Data Scientists to deliver high-quality datasets
Develop and maintain data models and transformations (batch and streaming)
Monitor data pipeline performance and troubleshoot production issues
Ensure data governance, security, and compliance standards are met
Optimize Spark workloads for performance and cost efficiency in Azure

Requirements:

3+ years of experience as a Data Engineer
Strong experience with Azure Databricks (Spark, PySpark, SQL)
Hands-on experience building and maintaining workflows in Apache Airflow
Proficiency with GitHub for version control and CI/CD
Strong SQL skills and experience with relational and cloud-based databases
Experience working in Azure cloud environments
Solid understanding of data modeling concepts (star schema, normalization, etc.)
Experience building scalable, production-grade data pipelines
