Rapid Eagle Inc is seeking a Senior Data Engineer to build the core ingestion layer of a modern healthcare data platform. The role involves porting a high-volume event pipeline into Databricks, integrating it with Kafka, and implementing complex patient-matching and event-handling logic.
Responsibilities:
- Port the core ingestion pipeline into a Databricks-native architecture conforming to standard ingestion layer patterns
- Implement per-job success/failure tracking and metrics capture in alignment with platform engineering standards
- Integrate with bulk patient matching libraries to accurately process patient update signals and lifecycle events
- Build event handlers for patient merges, practice merges, and facility-level changes
- Develop Databricks-to-Kafka (D2K) jobs for ingestion model outputs and downstream event streams
- Ensure the solution is low-maintenance, well-documented, and observable in production
Requirements:
- Databricks — Delta Lake, Jobs, cluster management, Notebooks
- Apache Kafka — producer/consumer patterns, event-driven architecture
- PySpark / Spark Structured Streaming
- Python — advanced data engineering
- High-volume stateful event stream processing
- Experience porting or refactoring large-scale data pipelines
- Delta Live Tables (DLT)
- Databricks Asset Bundles / CI/CD for Databricks
- Confluent Kafka or AWS MSK
- dbt on Databricks
- Healthcare data or patient identity matching experience