E-IT is seeking a Data Quality Engineer to support a Provider 360 data product program for a leading healthcare client. In this role, you will join a cross-functional team to deliver a high-quality data pipeline and data quality framework, with a focus on building ingestion pipelines and enforcing data quality rules.
Responsibilities:
- Design & Build Ingestion Pipelines – Develop end-to-end data pipelines from source systems through bronze, silver, and gold layers, adhering to a medallion (multi-tier) data architecture. This includes setting up source data ingestion (bronze layer) with initial quality checks (schema validation, completeness), transforming and refining data in the intermediate silver layer, and preparing curated datasets in the gold layer aligned with business use cases (a minimal pipeline sketch follows this list)
- Implement Data Quality Controls – Define and embed data quality rules into the pipelines (e.g., checks for data completeness, consistency, and accuracy) and configure threshold-based alerts for data quality metrics. Ensure that any data anomalies trigger logging and notifications, with mechanisms for handling and recovering from data errors, such as retry logic and error-handling procedures (see the quality-check sketch after this list)
- Testing & Validation – Establish a robust testing framework for the data pipeline. Develop automated unit tests covering at least 90% of data transformation logic, and create integration tests to validate end-to-end data flows and dependencies (see the unit-test sketch after this list). Collaborate on performance testing (throughput, latency) to ensure the data pipelines meet or exceed SLAs and can scale for future growth
- Metadata Management & Documentation – Contribute to the data dictionary and metadata standards. Document all critical data elements and transformations, capturing field definitions, data types, sources, and owners for the gold layer (and relevant bronze/silver fields); a sample dictionary entry follows this list. Help establish clear metadata conventions (naming standards, data lineage, data quality metrics) and ensure that all documentation (dictionary, lineage, quality rules) is reviewed and approved by project stakeholders
- Collaboration & Agile Delivery – Work closely with Data Engineers, the Data Modeler, and the Tech Lead in an Agile environment to meet sprint commitments and project milestones. Communicate progress, issues, and solutions effectively with both technical team members and project leadership, in line with data engineering best practices and project standards
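The sketches below illustrate the kind of work described in these responsibilities; they are simplified examples, not the client's actual implementation. First, a minimal bronze-to-silver-to-gold flow, assuming a Spark-based stack; the paths, column names (npi, provider_name, state), and the state-level aggregation are hypothetical placeholders.

```python
# Illustrative only: a minimal bronze -> silver -> gold flow, assuming a
# Spark-based stack. Paths and columns (npi, provider_name, state) are
# hypothetical placeholders, not the client's real Provider 360 schema.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("provider360_ingestion").getOrCreate()

# Bronze: land raw source data with basic schema and completeness checks
bronze = spark.read.option("header", True).csv("/data/source/provider_raw.csv")
expected_cols = {"npi", "provider_name", "state"}
missing = expected_cols - set(bronze.columns)
if missing:
    raise ValueError(f"Schema validation failed; missing columns: {missing}")
bronze.write.mode("overwrite").parquet("/data/bronze/provider")

# Silver: cleanse and standardize (drop records missing the business key)
silver = (
    bronze.filter(F.col("npi").isNotNull())
          .withColumn("provider_name", F.trim(F.col("provider_name")))
          .dropDuplicates(["npi"])
)
silver.write.mode("overwrite").parquet("/data/silver/provider")

# Gold: curated, use-case-aligned dataset (here, provider counts by state)
gold = silver.groupBy("state").agg(F.count("npi").alias("provider_count"))
gold.write.mode("overwrite").parquet("/data/gold/provider_counts_by_state")
```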
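Next, a hedged sketch of a threshold-based quality check with logging and simple retry logic; the 99% completeness threshold and the log-based "alert" are assumptions standing in for whatever alerting channel the project actually uses.

```python
# Illustrative only: a threshold-based completeness check with logging and
# retry logic. The 99% threshold and log-based alert are assumptions.
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("provider360.dq")

def completeness_ratio(records, field):
    """Share of records with a non-empty value for `field`."""
    if not records:
        return 0.0
    populated = sum(1 for r in records if r.get(field) not in (None, ""))
    return populated / len(records)

def run_quality_check(records, field, threshold=0.99, max_retries=3):
    """Log an alert and return False if completeness falls below the threshold."""
    for attempt in range(1, max_retries + 1):
        try:
            ratio = completeness_ratio(records, field)
            if ratio < threshold:
                # A real pipeline would notify on-call staff here (e-mail, pager, etc.)
                logger.error("DQ alert: %s completeness %.2f%% is below the %.2f%% threshold",
                             field, ratio * 100, threshold * 100)
                return False
            logger.info("DQ pass: %s completeness %.2f%%", field, ratio * 100)
            return True
        except Exception:
            logger.exception("DQ check errored on attempt %d of %d", attempt, max_retries)
            time.sleep(2 ** attempt)  # simple backoff before retrying
    return False

# Toy usage: the second record is missing its NPI, so the check alerts
rows = [{"npi": "1234567890", "state": "OH"}, {"npi": None, "state": "TX"}]
run_quality_check(rows, "npi")
```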
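Third, a sample unit test of the kind that would contribute to the 90% coverage target; the transformation function and its field names are hypothetical.

```python
# Illustrative only: a pytest-style unit test for a small, hypothetical
# transformation function; field names are placeholders.
import pytest

def standardize_provider(record: dict) -> dict:
    """Trim whitespace, upper-case the state code, and reject records without an NPI."""
    if not record.get("npi"):
        raise ValueError("record is missing the npi business key")
    return {
        "npi": record["npi"].strip(),
        "provider_name": record.get("provider_name", "").strip(),
        "state": record.get("state", "").strip().upper(),
    }

def test_standardize_provider_trims_and_uppercases():
    raw = {"npi": " 1234567890 ", "provider_name": "  Dr. Smith ", "state": "oh"}
    assert standardize_provider(raw) == {
        "npi": "1234567890",
        "provider_name": "Dr. Smith",
        "state": "OH",
    }

def test_standardize_provider_rejects_missing_npi():
    with pytest.raises(ValueError):
        standardize_provider({"provider_name": "Dr. Smith", "state": "OH"})
```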
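Finally, one possible shape for a gold-layer data dictionary entry; the field name, source, owner, and quality rules shown are placeholders rather than the program's actual standards.

```python
# Illustrative only: a single gold-layer data dictionary entry as a plain
# Python dict. All values are hypothetical placeholders.
provider_count_entry = {
    "field": "provider_count",
    "layer": "gold",
    "definition": "Number of distinct active providers per state",
    "data_type": "integer",
    "source": "silver.provider (aggregated by state on npi)",
    "owner": "Provider 360 data product team",
    "quality_rules": ["not null", "value >= 0"],
}
```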
Requirements:
- Experience in developing end-to-end data pipelines from source systems through bronze, silver, and gold layers
- Knowledge of medallion (multi-tier) data architecture
- Ability to implement data quality controls, including checks for data completeness, consistency, and accuracy
- Experience in defining and embedding data quality rules into data pipelines
- Proficiency in establishing a robust testing framework for data pipelines
- Experience in developing automated unit tests covering at least 90% of data transformation logic
- Ability to create integration tests to validate end-to-end data flows and dependencies
- Experience in performance testing (throughput, latency) to ensure data pipelines meet or exceed SLAs
- Knowledge of metadata management and documentation standards
- Ability to document critical data elements and transformations, including field definitions, data types, sources, and owners
- Experience in establishing clear metadata conventions and ensuring documentation is reviewed and approved by stakeholders
- Ability to collaborate effectively in an Agile environment with cross-functional teams