Gentiva is a company focused on transforming the delivery of data-driven insights in healthcare. The Databricks Data Engineer will design and build robust data pipelines, optimize their performance, and ensure data accuracy and security.
Responsibilities:
- Translate business requirements into technical specifications and document solution designs, data flows, and architecture
- Design, develop, and maintain ETL/ELT pipelines using Azure Data Factory, Databricks, and Apache Spark
- Implement Delta Lake architecture for reliable data storage and processing
- Build and optimize data workflows using Databricks Workflows and Jobs
- Develop scalable data models following medallion architecture (bronze, silver, gold layers)
- Implement Unity Catalog for data governance, access control, and metadata management
- Create and maintain Databricks notebooks for data transformation and analysis
- Optimize Spark jobs for performance and cost efficiency
- Implement data quality checks and validation frameworks
- Collaborate with BI developers, data analysts, and data scientists
- Design and implement data orchestration workflows using Azure Data Factory to coordinate complex ETL/ELT processes
- Develop and maintain CI/CD pipelines for data workflows
- Monitor data pipeline performance and troubleshoot issues
- Document data processes, architectures, and best practices
- Ensure compliance with data security and privacy regulations
- Provide support for new and existing solutions
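The medallion architecture mentioned above (raw bronze data, cleaned silver data, business-level gold aggregates) can be illustrated with a toy, stdlib-only Python sketch. The field names here are hypothetical; a real Databricks pipeline would use Spark DataFrames and Delta tables rather than plain dictionaries.

```python
from statistics import mean

# Bronze: raw records ingested as-is, including bad rows.
bronze = [
    {"patient_id": "p1", "visit_cost": "120.50"},
    {"patient_id": "p2", "visit_cost": "80.00"},
    {"patient_id": None, "visit_cost": "35.00"},   # malformed row
    {"patient_id": "p1", "visit_cost": "200.00"},
]

# Silver: cleaned and typed -- drop rows missing a key, cast cost to float.
silver = [
    {"patient_id": r["patient_id"], "visit_cost": float(r["visit_cost"])}
    for r in bronze
    if r["patient_id"] is not None
]

# Gold: business-level aggregate -- average visit cost per patient.
gold = {}
for r in silver:
    gold.setdefault(r["patient_id"], []).append(r["visit_cost"])
gold = {pid: round(mean(costs), 2) for pid, costs in gold.items()}

print(gold)  # {'p1': 160.25, 'p2': 80.0}
```

Each layer only ever reads from the layer below it, which is what makes failed loads easy to replay: the bronze copy is never mutated.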
Requirements:
- Bachelor's degree in Computer Science, Information Technology or related field
- 5+ years of progressive experience in data engineering, analytics, or software development
- 3+ years of hands-on experience with Databricks platform
- Strong experience with Apache Spark and PySpark
- Excellent problem-solving and analytical skills
- Strong oral and written communication abilities
- Self-motivated with ability to adapt to new technologies quickly
- Team player with ability to work independently
- Detail-oriented with strong organizational skills
- Ability to manage multiple priorities and meet deadlines
- Experience communicating technical concepts to non-technical stakeholders
- Expert-level knowledge of Databricks Workspace, clusters, and notebooks
- Delta Lake implementation and optimization
- Unity Catalog for data governance and cataloging
- Databricks SQL (formerly SQL Analytics)
- Databricks Workflows and job orchestration
- Delta Live Tables (DLT) for pipeline orchestration and data quality
- Advanced Python programming (PySpark, pandas, NumPy)
- Advanced SQL (query optimization, performance tuning)
- Git version control and collaborative development
- Azure Databricks
- Cloud storage services (ADLS Gen2, Azure Blob Storage)
- Azure Data Factory for pipeline orchestration and integration
- Experience designing and managing Azure Data Factory pipelines, triggers, and linked services
- Infrastructure as Code (Terraform)
- Experience with BI tools (Power BI, SSRS)
- Data warehousing and data modeling concepts
- SQL Server, including SSIS (Integration Services)
- Scala programming
- Healthcare IT or healthcare data experience
Preferred certifications:
- Databricks Certified Data Engineer Associate (strongly preferred)
- Databricks Certified Data Engineer Professional
- Databricks Lakehouse Fundamentals
- Azure Data Engineer Associate (DP-203)
- Apache Spark certifications
- Experience with complex data modeling including dimensional modeling, star/snowflake schemas
- Experience with medallion architecture (bronze/silver/gold layers)
- Data quality and validation framework implementation
- CI/CD pipeline development for data workflows (Azure DevOps)
- Performance tuning and cost optimization
- DataOps and DevOps practices
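The data-quality and validation framework work listed above amounts, at its core, to rule-based expectation checks evaluated per record (Delta Live Tables expresses these declaratively as expectations). A minimal stdlib-Python sketch, with hypothetical rule and field names:

```python
# Minimal rule-based validation sketch (hypothetical rule names; on a real
# Databricks pipeline this would be done with DLT expectations or similar).

def validate(record, rules):
    """Return the names of all rules the record fails."""
    return [name for name, check in rules.items() if not check(record)]

rules = {
    "patient_id_present": lambda r: bool(r.get("patient_id")),
    "cost_non_negative": lambda r: r.get("visit_cost", 0) >= 0,
}

records = [
    {"patient_id": "p1", "visit_cost": 120.5},
    {"patient_id": "", "visit_cost": -5.0},
]

failures = {i: validate(r, rules)
            for i, r in enumerate(records) if validate(r, rules)}
print(failures)  # {1: ['patient_id_present', 'cost_non_negative']}
```

Keeping rules as named, independent predicates makes failure reports auditable, which matters under the healthcare compliance requirements this role carries.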