Connect Tech+Talent is seeking a Data Scientist (Big Data Engineer) 2 for a 5-month contract. The role involves implementing ETL/ELT workflows, automating deployments, and collaborating with cross-functional teams to design and maintain data models and storage solutions.
Responsibilities:
- Implement ETL/ELT workflows for both structured and unstructured data
- Automate deployments using CI/CD tools
- Collaborate with cross-functional teams including data scientists, analysts, and stakeholders
- Design and maintain data models, schemas, and database structures to support analytical and operational use cases
- Evaluate and implement appropriate data storage solutions, including data lakes (Azure Data Lake Storage) and data warehouses
- Implement data validation and quality checks to ensure accuracy and consistency
- Contribute to data governance initiatives, including metadata management, data lineage, and data cataloging
- Implement data security measures, including encryption, access controls, and auditing; ensure compliance with regulations and best practices
- Proficiency in Python and R programming languages
- Strong SQL querying and data manipulation skills
- Experience with Azure cloud platform
- Experience with DevOps, CI/CD pipelines, and version control systems
- Experience working in agile, multicultural environments
- Strong troubleshooting and debugging capabilities
- Design and develop scalable data pipelines using Apache Spark on Databricks
- Optimize Spark jobs for performance and cost-efficiency
- Integrate Databricks solutions with cloud services (Azure Data Factory)
- Ensure data quality, governance, and security using Unity Catalog or Delta Lake
- Deep understanding of Apache Spark architecture, RDDs, DataFrames, and Spark SQL
- Hands-on experience with Databricks notebooks, clusters, jobs, and Delta Lake
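To make the ETL/ELT and data-validation responsibilities concrete, here is a minimal sketch in plain Python, with the stdlib `sqlite3` module standing in for the target warehouse. The schema, field names, and quality rule are illustrative assumptions, not taken from the posting; a production pipeline would use Spark on Databricks as described above.

```python
import csv
import io
import sqlite3

# Illustrative raw input; in practice this would come from a data lake.
RAW_CSV = """order_id,amount,region
1,19.99,EMEA
2,-5.00,APAC
3,42.50,AMER
"""

def extract(raw: str) -> list[dict]:
    """Extract: parse raw CSV text into records."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows: list[dict]) -> list[dict]:
    """Transform + validate: cast types and drop rows failing quality checks."""
    clean = []
    for row in rows:
        amount = float(row["amount"])
        if amount < 0:  # hypothetical quality rule: no negative amounts
            continue
        clean.append({"order_id": int(row["order_id"]),
                      "amount": amount,
                      "region": row["region"]})
    return clean

def load(rows: list[dict], conn: sqlite3.Connection) -> None:
    """Load: write validated records into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS orders "
                 "(order_id INTEGER, amount REAL, region TEXT)")
    conn.executemany("INSERT INTO orders VALUES (:order_id, :amount, :region)", rows)

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
print(conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone())
```

The same extract/transform/load shape carries over to Spark: `extract` becomes a DataFrame read, the quality rule a `filter`, and `load` a Delta table write.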
Requirements:
- 4 Years - Implement ETL/ELT workflows for both structured and unstructured data
- 4 Years - Automate deployments using CI/CD tools
- 4 Years - Collaborate with cross-functional teams including data scientists, analysts, and stakeholders
- 4 Years - Design and maintain data models, schemas, and database structures to support analytical and operational use cases
- 4 Years - Evaluate and implement appropriate data storage solutions, including data lakes (Azure Data Lake Storage) and data warehouses
- 4 Years - Implement data validation and quality checks to ensure accuracy and consistency
- 4 Years - Contribute to data governance initiatives, including metadata management, data lineage, and data cataloging
- 4 Years - Implement data security measures, including encryption, access controls, and auditing; ensure compliance with regulations and best practices
- 4 Years - Proficiency in Python and R programming languages
- 4 Years - Strong SQL querying and data manipulation skills
- 4 Years - Experience with Azure cloud platform
- 4 Years - Experience with DevOps, CI/CD pipelines, and version control systems
- 4 Years - Experience working in agile, multicultural environments
- 4 Years - Strong troubleshooting and debugging capabilities
- 3 Years - Design and develop scalable data pipelines using Apache Spark on Databricks
- 3 Years - Optimize Spark jobs for performance and cost-efficiency
- 3 Years - Integrate Databricks solutions with cloud services (Azure Data Factory)
- 3 Years - Ensure data quality, governance, and security using Unity Catalog or Delta Lake
- 3 Years - Deep understanding of Apache Spark architecture, RDDs, DataFrames, and Spark SQL
- 3 Years - Hands-on experience with Databricks notebooks, clusters, jobs, and Delta Lake
- 1 Year - Knowledge of ML libraries and tools (MLflow, Scikit-learn, TensorFlow)
- 1 Year - Databricks Certified Associate Developer for Apache Spark
- 1 Year - Azure Data Engineer Associate
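The data-validation and quality-check requirements above can be illustrated with a small rule-based checker that counts failures per rule across a batch of records. All rule names and field names here are hypothetical; in a Databricks environment the same checks would typically be expressed as Delta Lake constraints or expectations.

```python
from typing import Callable

# Each rule is a (name, predicate) pair evaluated per record.
Rule = tuple[str, Callable[[dict], bool]]

RULES: list[Rule] = [
    ("order_id present", lambda r: r.get("order_id") is not None),
    ("amount non-negative",
     lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0),
    ("region known", lambda r: r.get("region") in {"EMEA", "APAC", "AMER"}),
]

def quality_report(records: list[dict]) -> dict[str, int]:
    """Count failures per rule across a batch of records."""
    failures = {name: 0 for name, _ in RULES}
    for record in records:
        for name, check in RULES:
            if not check(record):
                failures[name] += 1
    return failures

batch = [
    {"order_id": 1, "amount": 10.0, "region": "EMEA"},
    {"order_id": None, "amount": -2.0, "region": "LATAM"},
]
print(quality_report(batch))  # each rule fails once, on the second record
```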