GENNTE Technologies is seeking a Data Engineer with strong Databricks expertise to design, build, and optimize scalable data pipelines on modern cloud data platforms. The ideal candidate will be responsible for implementing data ingestion, transformation, and integration workflows, ensuring data quality and performance across all pipelines while collaborating with data architects and analysts.
Responsibilities:
- Design, develop, and optimize data pipelines using Databricks, Apache Spark, and Delta Lake
- Implement data ingestion, transformation, and integration workflows for large-scale datasets
- Ensure data quality, reliability, and performance across all pipelines
- Collaborate with data architects and analysts to enable analytics and reporting capabilities
- Apply best practices for Lakehouse architecture and cloud-based data engineering
- Monitor and troubleshoot pipeline performance, ensuring scalability and efficiency
- Work within an Agile framework for sprint planning, backlog management, and continuous delivery
Requirements:
- 5+ years of experience in data engineering with a strong focus on Databricks
- Hands-on expertise in Apache Spark, Delta Lake, and Databricks Lakehouse architecture
- Strong knowledge of Python and PySpark
- Strong SQL skills (joins, window functions, optimization)
- Experience with cloud platforms (Azure preferred; AWS or GCP also relevant) and data integration tools
- Knowledge of ETL/ELT, change data capture (CDC), and slowly changing dimensions (SCD Type 1 and Type 2)
- Familiarity with CI/CD pipelines and DevOps practices
- Understanding of data governance, metadata management, and performance tuning
- Excellent problem-solving and communication skills
- Bachelor's degree in Computer Science, Data Engineering, or a related field
- Experience with Azure Data Factory, Azure DevOps, and cloud-native architectures
- Knowledge of data modeling and analytics enablement