Job Description:
We are seeking a skilled Databricks Developer to join our team for an onsite opportunity with CRC Groups in Raleigh, NC. The ideal candidate will have strong experience in Databricks, Apache Spark, PySpark, and cloud-based data engineering solutions, and will be responsible for developing, optimizing, and maintaining scalable data pipelines and supporting enterprise data initiatives.
Key Responsibilities
- Design, develop, and maintain data pipelines using Databricks and PySpark.
- Build and optimize ETL/ELT processes for large-scale data integration.
- Develop data transformation workflows using Spark SQL and Delta Lake.
- Work with structured and unstructured data from multiple sources.
- Implement data quality checks and monitoring processes.
- Optimize Databricks workloads for performance and cost efficiency.
- Collaborate with data architects, business analysts, and stakeholders to gather requirements and deliver solutions.
- Troubleshoot and resolve data pipeline and performance issues.
- Participate in code reviews and follow data engineering best practices.
- Maintain technical documentation and support production deployments.
Required Skills
- 5+ years of experience in data engineering and big data technologies.
- Strong hands-on experience with Databricks.
- Expertise in PySpark and Spark SQL.
- Experience with Delta Lake and Databricks Workflows.
- Strong SQL development and performance tuning skills.
- Experience with cloud platforms such as Azure, AWS, or Google Cloud Platform.
- Knowledge of ETL/ELT processes and data warehousing concepts.
- Experience with Git and CI/CD processes.
- Strong analytical and problem-solving skills.
Preferred Skills
- Experience with Structured Streaming.
- Knowledge of Unity Catalog and Delta Live Tables.
- Familiarity with Azure Data Factory, Airflow, or similar orchestration tools.
- Databricks certification is a plus.