Job Title: PySpark Data Engineer – HL7 Integration Specialist
Job Summary
We are seeking a highly skilled PySpark Data Engineer with strong experience in HL7 healthcare data standards and large-scale data processing. The ideal candidate will be responsible for designing, developing, and optimizing healthcare data integration pipelines, transforming HL7 messages, and enabling efficient data exchange across healthcare systems.
Key Responsibilities
Design, develop, and maintain scalable data pipelines using PySpark.
Process, transform, and validate healthcare data based on HL7 standards (any version, such as HL7 v2.x, HL7 v3, or FHIR).
Build and optimize ETL/ELT workflows for healthcare interoperability solutions.
Collaborate with business stakeholders, architects, and healthcare domain teams to understand data integration requirements.
Ensure data quality, integrity, security, and compliance with healthcare regulations.
Troubleshoot and resolve data processing and integration issues.
Develop reusable frameworks and best practices for healthcare data ingestion and transformation.
Monitor pipeline performance and implement optimization strategies.
Create and maintain technical documentation and data mapping specifications.
Required Skills (Must Have)
Strong hands-on experience with PySpark and distributed data processing
Solid understanding of HL7 standards (any version) and healthcare data exchange concepts
Experience designing and implementing data transformation and integration solutions
Proficiency in SQL and data modeling concepts
Experience working with large-scale datasets and ETL processes
Strong problem-solving and analytical skills
Excellent communication and collaboration abilities
Preferred Skills (Good to Have)
Experience with Smile CDR
Knowledge of Google Cloud Platform (GCP) services
Experience with Informatica (PowerCenter, IICS, or related tools)
Familiarity with healthcare interoperability standards such as FHIR
Exposure to cloud-based data engineering architectures
Qualifications
Bachelor's or Master's degree in Computer Science, Information Technology, Engineering, or a related field.
Relevant experience in data engineering, healthcare interoperability, or healthcare data integration projects.
Experience
4+ years of experience in data engineering
2+ years of hands-on experience with PySpark and healthcare data integration using HL7