Job Summary
We are looking for a hands-on Lead Data Engineer with strong expertise in Databricks and proven experience in the retail domain. The ideal candidate is a player-coach who can lead a team while actively contributing to the design, development, and optimization of scalable data pipelines, and who has the communication skills to work effectively with stakeholders and business teams.
Key Responsibilities
- Design and build scalable data pipelines using PySpark and SQL on Databricks
- Work with large-scale retail data (transaction, customer, product, and inventory data)
- Implement data models using Lakehouse architecture
- Lead a team of engineers while remaining hands-on in coding
- Collaborate with cross-functional teams and clearly communicate technical solutions to non-technical stakeholders
- Optimize performance of ETL/ELT pipelines
- Ensure data quality, governance, and best practices
Required Skills
- Strong hands-on experience with Databricks (must-have)
- Expertise in PySpark and Spark SQL
- Strong coding skills in Python and SQL
- Experience with Delta Lake / Lakehouse architecture
- Proven experience in building scalable data pipelines
- Excellent communication and stakeholder management skills
- Ability to translate business requirements into technical solutions
Location Requirement
- Candidates must be local to Dallas, TX, or the surrounding area. No relocation.