Fractal Analytics is a strategic AI partner to Fortune 500 companies, with a mission to power every human decision in the enterprise. The Databricks Data Engineer will design and optimize scalable data pipelines for Claims Payment Integrity (PI) analytics, ensuring high-quality data is available to the downstream analytics teams that depend on it.
Responsibilities:
- Build scalable ETL/ELT pipelines in Databricks using PySpark, Spark SQL, Delta Live Tables, and workflows
- Engineer curated datasets across bronze/silver/gold layers for claims, pricing, provider, RCM, and member data (an illustrative bronze-to-silver sketch follows this list)
- Implement Delta Lake best practices including ACID transactions, schema evolution, CDC, and optimized storage formats
- Automate ingestion/transformation of large datasets from claims systems, provider files, call center platforms, and EHR feeds
- Perform reconciliation and validation of claim‑related financial datasets
- Enforce PHI‑compliant design patterns using Unity Catalog, governance guardrails, and cluster policies
- Implement pipeline monitoring, logging, and Spark performance optimization
- Work with Data Analysts, Data Scientists, and PI SMEs to translate analytic requirements into production data assets
- Support cluster optimization, Delta table layout tuning (Z‑ORDER clustering), and cost‑efficient lakehouse operations
- Participate in Agile ceremonies and ensure timely delivery of engineering tasks
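The responsibilities above combine the medallion (bronze/silver/gold) pattern with Delta Lake CDC-style merges and Z‑ORDER layout optimization. Purely as an illustration of that kind of work, a minimal bronze-to-silver upsert sketch is shown below; the table, path, and column names (claims_bronze, claims_silver, claim_id, etc.) are assumptions for illustration only and not part of the role description, and `spark` is the SparkSession Databricks provides in a notebook.

```python
# Minimal sketch of a bronze -> silver claims upsert in Databricks (PySpark + Delta Lake).
# Table and column names (claims_bronze, claims_silver, claim_id, ...) are illustrative assumptions.
from pyspark.sql import functions as F
from delta.tables import DeltaTable

# Read the latest raw claim records from the bronze layer.
bronze_df = (
    spark.table("claims_bronze")
    .filter(F.col("ingest_date") == F.current_date())
)

# Light standardization for the silver layer: typed amounts, trimmed identifiers, dedup on claim_id.
silver_updates = (
    bronze_df
    .withColumn("claim_amount", F.col("claim_amount").cast("decimal(18,2)"))
    .withColumn("claim_id", F.trim(F.col("claim_id")))
    .dropDuplicates(["claim_id"])
)

# CDC-style upsert into the silver Delta table (ACID merge keyed on claim_id).
silver_table = DeltaTable.forName(spark, "claims_silver")
(
    silver_table.alias("t")
    .merge(silver_updates.alias("s"), "t.claim_id = s.claim_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)

# Co-locate data for common query predicates to keep reads cost-efficient.
spark.sql("OPTIMIZE claims_silver ZORDER BY (provider_id, service_date)")
```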
Requirements:
- Hands-on experience with Databricks (PySpark, SQL, Delta Lake, Jobs/Workflows)
- Strong Spark performance tuning experience
- Experience engineering data for claims, provider, and membership domains
- Strong understanding of healthcare data models and adjudication flows
- Typically 5–8 years of Data Engineering experience in healthcare
- Bachelor's degree (4‑year)
- Experience with call center data (member and provider interactions), provider RCM datasets, and EHR/clinical data
- Experience with DLT, CI/CD, and MLflow‑integrated pipelines (a brief DLT sketch follows this list)
- Exposure to actuarial or PI forecasting workflows
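For the DLT requirement above, a minimal Delta Live Tables definition with a single data-quality expectation might look like the following sketch; the dataset names, landing path, and expectation rule are illustrative assumptions, not part of the role description.

```python
# Minimal Delta Live Tables sketch with a data-quality expectation.
# Dataset names (raw_claims, clean_claims), the landing path, and the rule are illustrative assumptions.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw claim records ingested from cloud storage (bronze).")
def raw_claims():
    # Auto Loader incrementally picks up new claim files; the path is a placeholder.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/claims/landing/")
    )

@dlt.table(comment="Validated claim records ready for downstream analytics (silver).")
@dlt.expect_or_drop("valid_claim_id", "claim_id IS NOT NULL")
def clean_claims():
    # Drop records failing the expectation, then standardize the amount column.
    return (
        dlt.read_stream("raw_claims")
        .withColumn("claim_amount", F.col("claim_amount").cast("decimal(18,2)"))
    )
```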