Daman is seeking a Databricks Engineer to develop, maintain, and optimize big data solutions on the Databricks Unified Analytics Platform. The role supports data engineering, machine learning, and analytics initiatives, with a focus on large-scale data processing.
Responsibilities:
- Designing and developing scalable data pipelines
- Implementing ETL/ELT workflows
- Optimizing Spark jobs
- Integrating with Azure Data Factory
- Automating deployments
- Collaborating with cross-functional teams
- Ensuring data quality, governance, and security
Requirements:
- Design and develop scalable data pipelines using Apache Spark on Databricks
- Implement ETL/ELT workflows for both structured and unstructured data
- Optimize Spark jobs for performance and cost-efficiency
- Integrate Databricks solutions with cloud services (Azure Data Factory)
- Automate deployments using CI/CD tools
- Collaborate with cross-functional teams, including data scientists, analysts, and stakeholders
- Design and maintain data models, schemas, and database structures to support analytical and operational use cases
- Evaluate and implement appropriate data storage solutions, including data lakes (Azure Data Lake Storage) and data warehouses
- Implement data validation and quality checks to ensure accuracy and consistency
- Ensure data quality, governance, and security using Unity Catalog or Delta Lake
- Contribute to data governance initiatives, including metadata management, data lineage, and data cataloging
- Implement data security measures, including encryption, access controls, and auditing; ensure compliance with regulations and best practices
- Deep understanding of Apache Spark architecture, RDDs, DataFrames, and Spark SQL
- Hands-on experience with Databricks notebooks, clusters, jobs, and Delta Lake
- Proficiency in Python and R
- Strong SQL querying and data manipulation skills
- Knowledge of ML libraries (MLflow, Scikit-learn, TensorFlow)
- Experience with the Azure cloud platform
- Experience with DevOps, CI/CD pipelines, and version control systems
- Strong troubleshooting and debugging capabilities
- Experience working in agile, multicultural environments
- Certification: Databricks Certified Associate Developer for Apache Spark
- Certification: Azure Data Engineer Associate