Sentara Health is hiring a Data Engineer to support a modern data platform built on Databricks. The role focuses on building scalable data pipelines, ensuring data quality and governance, and collaborating with stakeholders on data onboarding and requirements.

Responsibilities:

Develop and maintain data pipelines using PySpark and Databricks
Work within a metadata-driven ingestion framework to onboard new datasets
Implement data quality checks and validation rules within pipelines
Support ingestion from file-based sources and ingestion tools (e.g., Fivetran)
Handle schema changes, incremental loads, and file processing patterns
Contribute to data governance practices including tagging, metadata, and lineage
Troubleshoot and resolve pipeline failures and performance issues
Collaborate with architects and stakeholders on data onboarding and requirements
Follow and contribute to coding standards, reusable components, and best practices

Requirements:

Bachelor's degree, or relevant experience in lieu of a degree
3+ years of relevant experience with a degree, or 5+ years without a degree
Hands-on experience with PySpark and Databricks
Strong SQL skills
Experience building ETL/ELT data pipelines
Understanding of Delta Lake concepts (merge, schema evolution, partitions)
Familiarity with cloud platforms (Azure preferred)
Basic experience with Git and version control
Exposure to data catalog or governance tools (e.g., DataHub)
Experience with Fivetran or similar ingestion tools
Understanding of data quality and validation concepts
Experience working with metadata-driven frameworks
Strong problem-solving and debugging skills
Ability to work in a structured, framework-driven environment
Focus on data quality, not just pipeline execution
Willingness to learn and adapt in a fast-evolving data ecosystem

Data Engineer - Remote
