Qualified Health is seeking a Data Integration Engineer to serve as the technical implementation specialist for healthcare data integration initiatives. The role involves designing and building robust data pipelines, ensuring data quality, and collaborating closely with a Data Integration Manager to meet partner needs.
Responsibilities:
- Design and build ETL pipelines using PySpark, SQL, and Azure data services to process healthcare data from multiple source systems
- Execute data extraction and transformation operations on complex healthcare datasets, ensuring accuracy and compliance with established standards
- Develop data quality validation frameworks to identify and resolve issues during integration, QC, and backtesting phases
- Troubleshoot technical issues including data schema mismatches, transformation logic errors, and performance bottlenecks
- Build reusable data components and standardized integration patterns that accelerate future implementations
- Optimize pipeline performance for large-scale healthcare datasets, ensuring efficient processing and resource utilization
- Implement data validation rules specific to healthcare contexts (e.g., clinical code validation, temporal logic checks, referential integrity)
- Write and maintain technical documentation for data pipelines, transformations, and integration patterns
- Support production deployments by coordinating with infrastructure teams and conducting final testing
- Partner with the Data Integration Manager to translate partner requirements into technical specifications
- Participate in technical discussions with partner IT teams to understand data schemas, access methods, and integration constraints
- Provide technical guidance on data mapping specifications and transformation approaches
- Identify data quality issues and work with the Manager to coordinate resolution with partners
- Share technical findings from QC and backtesting with the Manager to inform partner conversations
- Contribute to continuous improvement of tools, processes, and technical standards
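To illustrate the kind of validation rules the role involves (clinical code format checks and temporal logic checks), here is a minimal Python sketch; the record fields, rule names, and the simplified ICD-10 pattern are hypothetical examples, not part of this posting, and real ICD-10-CM formatting rules are stricter than this regex.

```python
import re
from datetime import date

# Simplified ICD-10-style code shape (illustrative only):
# one letter, one digit, one alphanumeric, optional dot + up to 4 alphanumerics.
ICD10_PATTERN = re.compile(r"^[A-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?$")

def validate_record(record: dict) -> list[str]:
    """Return the list of rule violations for one hypothetical encounter record."""
    errors = []
    # Clinical code validation: reject codes that don't fit the expected shape.
    if not ICD10_PATTERN.match(record.get("diagnosis_code", "")):
        errors.append("invalid_icd10")
    # Temporal logic check: discharge cannot precede admission.
    if record["discharge_date"] < record["admission_date"]:
        errors.append("discharge_before_admission")
    return errors
```

In practice, checks like these would run as transformations inside the pipeline (e.g. over PySpark DataFrames) rather than record by record, but the rule logic is the same.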
Requirements:
- 5+ years of experience in data analytics, data engineering, or solution delivery roles, with demonstrated expertise in data integration and ETL processes
- Strong analytical toolkit: proficiency in PySpark for distributed data processing, advanced SQL for data querying and transformation, and Excel for data analysis and reporting
- Production ETL experience: Track record of building and maintaining production-grade data pipelines with proper error handling and monitoring
- Data quality focus: Experience implementing validation frameworks and troubleshooting data quality issues
- Healthcare data experience: Prior work with healthcare datasets (EHR, claims, clinical, lab data)
- Problem-solving mindset: Ability to independently diagnose and resolve complex technical issues
- Attention to detail: Commitment to accuracy, testing, and delivering reliable solutions
- Collaborative working style: Comfortable partnering with non-technical colleagues and adapting to feedback
- Bachelor's degree in Computer Science, Engineering, Data Science, Mathematics, or related technical field
- Epic Clarity experience: Direct work with Epic's relational database structure and clinical data models
- Healthcare data standards knowledge: Understanding of FHIR, HL7v2, DICOM, LOINC, SNOMED, ICD-10
- Azure cloud platform: Hands-on experience with Azure Databricks, Data Factory, Blob Storage, Delta Lake
- Healthcare compliance awareness: Understanding of HIPAA requirements and healthcare data security best practices
- Data warehouse/lakehouse experience: Familiarity with dimensional modeling and modern data architecture patterns
- DevOps practices: Experience with Git, CI/CD pipelines, and infrastructure-as-code
- Performance tuning: Proven ability to optimize complex data transformations for scale
- LIMS/PACS experience: Prior work integrating laboratory or imaging systems data
- Multiple data format fluency: Experience with JSON, XML, Parquet, CSV, and other healthcare interchange formats