HYR Global Source Inc is seeking a skilled Data Engineer to join their team on a 12-month remote contract. The ideal candidate will have strong hands-on experience building scalable data pipelines and working with modern cloud-based data platforms.
Responsibilities:
- Design, develop, and maintain scalable ETL/data pipelines in cloud environments
- Build and manage both batch and real-time (streaming) data processing workflows
- Develop and optimize data transformations using Python, SQL, and Apache Spark
- Implement robust data solutions using Databricks and Azure Data Factory
- Ensure data pipeline reliability, scalability, and performance through monitoring and optimization
- Collaborate with cross-functional teams, including analysts, BI developers, and DevOps engineers
- Support reporting and analytics initiatives by enabling clean, well-structured datasets for BI tools
- Apply best practices in data quality, testing, version control, and CI/CD automation
Requirements:
- Strong hands-on experience building scalable data pipelines
- Experience working with modern cloud-based data platforms
- Expertise in data processing, automation, and performance optimization across large-scale data systems
- Programming: Python, SQL
- Data Processing: Apache Spark, Databricks, Azure Data Factory
- Streaming Technologies: Kafka, Google Pub/Sub (or similar messaging systems)
- DevOps & Automation: Git, Jenkins or GitHub Actions, Azure DevOps, Terraform, Docker
- Visualization & Reporting: Power BI, Tableau, or similar BI tools
- Experience with data lake and/or data warehouse architectures (e.g., Lakehouse concepts)
- Knowledge of data governance frameworks and data quality tools
- Familiarity with CI/CD pipelines and automated data testing best practices