Tech Genius Inc. is seeking an Azure Databricks Engineer for its client, Guy Carpenter. The role involves designing and building data pipelines, optimizing cluster configurations, and ensuring data quality and validation within the Azure Databricks platform.
Responsibilities:
- Design and Build Data Pipelines
  - Develop scalable and efficient data ingestion pipelines using Azure Databricks and Apache Spark
- Data Transformation and Processing
  - Create and maintain data transformation scripts and notebooks using Spark with Python, SQL, or Scala
  - Implement ELT/ETL workflows to prepare data for analytics and reporting
- Cluster Configuration and Optimization
  - Configure Azure Databricks clusters for optimal performance, cost-efficiency, and resource utilization
  - Manage cluster policies, including autoscaling, instance types, and runtime versions
  - Monitor cluster health and tune Spark configurations to improve job execution times
- Data Quality and Validation
  - Implement data quality checks and validation processes within Databricks notebooks and workflows
- Automation and CI/CD
  - Develop and maintain CI/CD pipelines for automated deployment of Databricks notebooks, jobs, and configurations
- Monitoring and Troubleshooting
  - Monitor Databricks jobs, clusters, and pipelines for failures and performance bottlenecks, and track platform usage metrics
  - Troubleshoot and resolve issues related to data processing, cluster performance, job execution, and the platform itself
- Cost Management and Optimization
  - Track and analyze Databricks usage and costs
  - Optimize cluster configurations and usage patterns to reduce expenses
  - Implement policies to control resource consumption and prevent cost overruns
- Collaboration and Support
  - Provide technical support and guidance on Databricks platform usage and best practices
Requirements:
- Hands-on experience developing scalable, efficient data ingestion pipelines with Azure Databricks and Apache Spark
- Working knowledge of DBT is a plus
- Proficiency creating and maintaining data transformation scripts and notebooks using Spark with Python, SQL, or Scala
- Experience implementing ELT/ETL workflows to prepare data for analytics and reporting
- Ability to configure Azure Databricks clusters for optimal performance, cost-efficiency, and resource utilization
- Experience managing cluster policies, including autoscaling, instance types, and runtime versions
- Ability to monitor cluster health and tune Spark configurations to improve job execution times
- Experience implementing data quality checks and validation processes within Databricks notebooks and workflows
- Experience developing and maintaining CI/CD pipelines for automated deployment of Databricks notebooks, jobs, and configurations
- Ability to monitor Databricks jobs, clusters, and pipelines for failures and performance bottlenecks
- Ability to troubleshoot and resolve issues related to data processing, cluster performance, job execution, and the platform itself
- Experience tracking and analyzing Databricks usage and costs
- Ability to optimize cluster configurations and usage patterns to reduce expenses
- Experience implementing policies to control resource consumption and prevent cost overruns
- Strong communication skills for providing technical support and guidance on Databricks platform usage and best practices