Bounteous is a premier end-to-end digital transformation consultancy that partners with ambitious brands to create digital solutions for today’s complex challenges and tomorrow’s opportunities. The company is seeking a Senior Databricks Engineer to design, build, and optimize large-scale data and analytics platforms on the Databricks Lakehouse. The role owns the architecture and delivery of production-grade data pipelines and works closely with analytics and data science teams.
Responsibilities:
- Architect, build, and maintain scalable ETL/ELT pipelines on the Databricks Lakehouse Platform using PySpark, Spark SQL, and Delta Lake
- Design and implement medallion (bronze/silver/gold) data architectures and enforce data quality, governance, and lineage standards
- Optimize Spark jobs and cluster configurations for performance and cost, including partitioning, caching, and autoscaling strategies
- Implement and manage Unity Catalog for access control, data governance, and cross-workspace asset sharing
- Build and orchestrate workflows using Databricks Workflows, Delta Live Tables, and CI/CD pipelines
- Collaborate with data scientists, analysts, and business stakeholders to translate requirements into reliable data products
- Establish engineering best practices, conduct code reviews, and mentor junior data engineers
- Monitor production pipelines, troubleshoot failures, and drive root-cause analysis and continuous improvement
Requirements:
- 5+ years of data engineering experience, with 3+ years building production solutions on Databricks and Apache Spark
- Expert proficiency in Python (PySpark) and advanced SQL
- Deep hands-on experience with Delta Lake, Unity Catalog, and the medallion architecture pattern
- Strong experience with at least one major cloud platform (AWS, Azure, or GCP) and its core data services
- Proven track record of optimizing Spark performance and managing cluster costs
- Experience with data modeling, warehousing concepts, and building dimensional/analytics-ready datasets
- Proficiency with Git-based version control, CI/CD, and infrastructure-as-code
- Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
- Databricks certification (Data Engineer Associate/Professional)
- Experience with Delta Live Tables, structured streaming, and real-time data processing
- Familiarity with MLflow and supporting machine learning workflows in production
- Experience with orchestration and transformation tools (e.g., Airflow, dbt) and data observability platforms
- Exposure to data governance, security, and compliance frameworks (e.g., GDPR, HIPAA, SOC 2)
- Hands-on experience using AI coding assistants (e.g., Claude Code, GitHub Copilot, Cursor) to accelerate development, refactoring, and code review
- Familiarity with large language model APIs and SDKs (e.g., Anthropic Claude, OpenAI) and prompt engineering for data and analytics use cases
- Experience integrating GenAI capabilities into data pipelines or applications, including retrieval-augmented generation (RAG) and vector search
- Awareness of responsible AI practices, including evaluation, guardrails, and cost/latency trade-offs when deploying LLM-based solutions