Tech Genius Inc. is seeking an Azure Databricks Engineer to design and build data pipelines. The role involves developing data ingestion pipelines, managing cluster configurations, and ensuring data quality and validation within the Azure Databricks platform.
Responsibilities:
- Develop scalable and efficient data ingestion pipelines using Azure Databricks and Apache Spark
- Create and maintain data transformation scripts and notebooks using Spark with Python, SQL, or Scala
- Implement ELT/ETL workflows to prepare data for analytics and reporting
- Configure Azure Databricks clusters for optimal performance, cost-efficiency, and resource utilization
- Manage cluster policies, including autoscaling, instance types, and runtime versions
- Monitor cluster health and tune Spark configurations to improve job execution times
- Implement data quality checks and validation processes within Databricks notebooks and workflows
- Develop and maintain CI/CD pipelines for automated deployment of Databricks notebooks, jobs, and configurations
- Monitor Databricks jobs, clusters, and pipelines for failures or performance bottlenecks, and report on platform usage metrics
- Troubleshoot and resolve issues related to data processing, cluster performance, job execution, and the platform itself
- Track and analyze Databricks usage and costs
- Optimize cluster configurations and usage patterns to reduce expenses
- Implement policies to control resource consumption and prevent cost overruns
- Provide technical support and guidance on Databricks platform usage and best practices
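Cluster policies of the kind described above are defined in Databricks as JSON documents that constrain what users can configure. The sketch below shows the general shape of such a policy; the specific runtime pattern, node types, and limits are illustrative assumptions, not values from this posting:

```json
{
  "spark_version": {
    "type": "regex",
    "pattern": "1[34]\\.[0-9]+\\.x-scala.*"
  },
  "node_type_id": {
    "type": "allowlist",
    "values": ["Standard_DS3_v2", "Standard_DS4_v2"],
    "defaultValue": "Standard_DS3_v2"
  },
  "autoscale.min_workers": {
    "type": "fixed",
    "value": 1
  },
  "autoscale.max_workers": {
    "type": "range",
    "maxValue": 8,
    "defaultValue": 4
  },
  "autotermination_minutes": {
    "type": "fixed",
    "value": 30,
    "hidden": true
  }
}
```

A policy like this pins allowed runtimes and instance types, caps autoscaling, and forces auto-termination of idle clusters, which is one common way to control resource consumption and prevent cost overruns.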
Requirements:
- Proven experience developing scalable, efficient data ingestion pipelines with Azure Databricks and Apache Spark
- Proficiency writing data transformation scripts and notebooks in Spark with Python, SQL, or Scala
- Hands-on experience implementing ELT/ETL workflows that prepare data for analytics and reporting
- Experience configuring Azure Databricks clusters for performance, cost-efficiency, and resource utilization
- Experience managing cluster policies, including autoscaling, instance types, and runtime versions
- Ability to monitor cluster health and tune Spark configurations to improve job execution times
- Experience implementing data quality checks and validation processes in Databricks notebooks and workflows
- Experience building and maintaining CI/CD pipelines for automated deployment of Databricks notebooks, jobs, and configurations
- Ability to monitor Databricks jobs, clusters, and pipelines for failures and performance bottlenecks
- Ability to troubleshoot and resolve issues related to data processing, cluster performance, job execution, and the platform itself
- Experience tracking and analyzing Databricks usage and costs
- Experience optimizing cluster configurations and usage patterns to reduce expenses
- Experience implementing policies to control resource consumption and prevent cost overruns
- Ability to provide technical support and guidance on Databricks platform usage and best practices
- Working knowledge of dbt a plus