Data Integration Engineer – Healthcare Data Infrastructure
United States
Full Time
5 hours ago
$130,000 - $180,000 USD
H1B Sponsor
Key skills
Azure, ETL, PySpark, SQL, Data Engineering, Analytics
About this role
Role Overview
Design and build ETL pipelines using PySpark, SQL, and Azure data services to process healthcare data from multiple source systems
Execute data extraction and transformation operations on complex healthcare datasets, ensuring accuracy and compliance with established standards
Develop data quality validation frameworks to identify and resolve issues during integration, QC, and backtesting phases
Troubleshoot technical issues including data schema mismatches, transformation logic errors, and performance bottlenecks
Build reusable data components and standardized integration patterns that accelerate future implementations
Optimize pipeline performance for large-scale healthcare datasets, ensuring efficient processing and resource utilization
Implement data validation rules specific to healthcare contexts (e.g., clinical code validation, temporal logic checks, referential integrity)
Write and maintain technical documentation for data pipelines, transformations, and integration patterns
Support production deployments by coordinating with infrastructure teams and conducting final testing
Requirements
5+ years of experience in data analytics, data engineering, or solution delivery roles, with demonstrated expertise in data integration and ETL processes
Strong analytical toolkit with proficiency in:
PySpark for distributed data processing
Advanced SQL for data querying and transformation
Excel for data analysis and reporting
Production ETL experience: Track record of building and maintaining production-grade data pipelines with proper error handling and monitoring
Data quality focus: Experience implementing validation frameworks and troubleshooting data quality issues
Healthcare data experience: Prior work with healthcare datasets (EHR, claims, clinical, lab data)
Problem-solving mindset: Ability to independently diagnose and resolve complex technical issues
Attention to detail: Commitment to accuracy, testing, and delivering reliable solutions
Collaborative working style: Comfortable partnering with non-technical colleagues and adapting to feedback
Bachelor's degree in Computer Science, Engineering, Data Science, Mathematics, or a related technical field