Build and maintain automated data validation frameworks to ensure accuracy, completeness, and integrity across ETL pipelines and downstream consumer systems
Develop Python-based automated tests for source-to-target validation, business logic verification, schema enforcement, anomaly detection, and historical data consistency
Validate data logic and end-to-end flows across batch and streaming data systems
Embed automated data quality checks into CI/CD pipelines to enable continuous validation, high signal-to-noise defect detection, and rapid feedback across environments
Collaborate with product, data engineering, analytics, and platform teams to define data quality requirements, validation criteria, and coverage strategies
Investigate data issues across pipelines and distributed systems, perform detailed root cause analysis using logs and data snapshots, and document actionable, reproducible defects in Jira
Requirements
5+ years of Python development experience in data-focused environments
2+ years leading automation or data quality initiatives within an agile setting
Expertise in designing and implementing CI/CD-based automation pipelines for data validation
Strong background in developing or architecting testing frameworks for large-scale or distributed data systems
Direct experience validating complex ETL systems, transformation pipelines, and analytics/reporting layers
Proficiency with SQL
Tech Stack
Distributed Systems
ETL
Python
SQL
Benefits
Employee discounts on awesome tech from day one
Flexible health benefits and wellness program
TFSA and RRSP programs
100% matched company pension plan
Training programs to build new and transferable skills