Banner Health, a nationally recognized healthcare leader, is seeking a Senior Data Engineer to design, build, and scale its next-generation cloud data platform. This role involves engineering reliable, secure, and scalable data products and pipelines that power analytics, reporting, AI, and operational insights across clinical and business domains.
Responsibilities:
- Designs, builds, and optimizes scalable batch and streaming data pipelines for enterprise analytics and operational use cases
- Contributes to the evolution of the enterprise data platform to support advanced analytics, self-service consumption, and AI/ML use cases
- Develops curated, high-quality data products across lakehouse, warehouse, and domain-oriented data architectures
- Builds and enhances cloud-native data solutions using modern platforms such as Databricks, Spark, Delta Lake, and AWS services
- Enforces engineering standards for code quality, testing, CI/CD, observability, lineage, and documentation
- Drives best practices for data quality, schema evolution, performance tuning, reliability, and cost optimization
- Partners with architecture, governance, security, and analytics teams to implement trusted and compliant data solutions
- Supports ingestion and transformation of complex healthcare and enterprise data sources, including structured, semi-structured, and high-volume event data
- Translates business and operational requirements into scalable technical designs and production-ready data pipelines
Requirements:
- Must possess strong knowledge of data engineering and analytics, as normally obtained through the completion of a Bachelor's degree in Data Science, Computer Science, Information Technology, or a related field
- Must have 4+ years of experience in data engineering, big data, analytics engineering, or data platform development
- Must have strong hands-on experience with Databricks, Apache Spark, Delta Lake, and cloud-based data platforms on AWS, Azure, or GCP
- Deep expertise in SQL and strong programming skills in Python and/or Java
- Experience building and operating large-scale distributed data systems, including data lakes, lakehouses, warehouses, or mesh-oriented platforms
- Must have a strong understanding of data modeling, partitioning, storage design, metadata management, and performance optimization
- Experience implementing data quality, lineage, observability, and operational monitoring in production environments
- Familiarity with orchestration, DevOps, and CI/CD practices for data platforms
- Must have strong communication skills and the ability to work effectively with technical and non-technical stakeholders
- Proven ability to balance multiple priorities in a fast-paced environment while maintaining high engineering standards
Preferred Qualifications:
- Experience in healthcare, payer-provider, clinical, or regulated data environments
- Familiarity with EHR, claims, FHIR, HL7, interoperability, or other healthcare data standards
- Knowledge of and experience with the OMOP common data model on AWS
- Experience with Verato and patient identity systems at scale
- Knowledge of and experience working with large datasets (300 TB and greater)
- Databricks accreditations or certifications
- Experience with governance and secure access models using tools such as Unity Catalog, Lake Formation, or equivalent
- Experience supporting AI/ML, feature engineering, vector or unstructured data pipelines, or data products for GenAI use cases
- Exposure to infrastructure-as-code, automated testing, and platform engineering practices
- Experience mentoring engineers and influencing cross-functional technical direction
- Additional related education and/or experience preferred