Design, implement, and optimize end-to-end data pipelines on Databricks, following Medallion Architecture principles.
Build robust, scalable ETL/ELT pipelines using Apache Spark and Delta Lake that transform raw (bronze) data into trusted, curated (silver) and analytics-ready (gold) layers.
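For illustration only, a minimal PySpark sketch of one such bronze-to-silver promotion; the catalog, table, and column names (main.bronze.enrollments_raw, enrollment_id, student_id) are placeholders, not an actual schema:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    bronze = spark.table("main.bronze.enrollments_raw")

    silver = (
        bronze
        .dropDuplicates(["enrollment_id"])            # drop replayed records
        .filter(F.col("student_id").isNotNull())      # enforce a basic contract
        .withColumn("ingested_at", F.to_timestamp("ingested_at"))
    )

    (silver.write
        .format("delta")
        .mode("overwrite")
        .saveAsTable("main.silver.enrollments"))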
Operationalize Databricks Workflows for orchestration, dependency management, and pipeline automation.
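As a rough sketch of what that orchestration can look like, the two-task job below is defined through the Databricks Jobs API 2.1; the workspace host, token handling, and notebook paths are placeholders, and a real job would also declare compute (a job cluster, or serverless):

    import requests

    host = "https://<workspace>.cloud.databricks.com"
    token = "<personal-access-token>"  # in practice, pull from a secret scope

    job_spec = {
        "name": "enrollments_pipeline",
        "tasks": [
            {
                "task_key": "ingest_bronze",
                "notebook_task": {"notebook_path": "/Pipelines/ingest_bronze"},
            },
            {
                "task_key": "build_silver",
                # depends_on expresses the dependency between tasks
                "depends_on": [{"task_key": "ingest_bronze"}],
                "notebook_task": {"notebook_path": "/Pipelines/build_silver"},
            },
        ],
    }

    resp = requests.post(
        f"{host}/api/2.1/jobs/create",
        headers={"Authorization": f"Bearer {token}"},
        json=job_spec,
    )
    resp.raise_for_status()
    print(resp.json())  # returns the job_id of the new workflow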
Apply schema evolution and data versioning to support agile data development.
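Both capabilities are built into Delta Lake; a sketch, assuming an existing SparkSession spark and an incoming DataFrame df_new carrying new columns:

    # Schema evolution: mergeSchema lets additive column changes land
    # without a manual ALTER TABLE.
    (df_new.write
        .format("delta")
        .mode("append")
        .option("mergeSchema", "true")
        .saveAsTable("main.silver.enrollments"))

    # Data versioning: Delta time travel reads a prior table version,
    # e.g. to audit a change or reproduce an earlier training set.
    v5 = spark.read.option("versionAsOf", 5).table("main.silver.enrollments")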
Connect to and ingest data from enterprise systems such as PeopleSoft, D2L, and Salesforce using APIs, JDBC, or other integration frameworks.
Implement connectors and ingestion frameworks that accommodate structured, semi-structured, and unstructured data.
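As one concrete pattern for the ingestion work above, a JDBC pull from a PeopleSoft-style source into bronze; the connection URL, secret scope, and table names are illustrative placeholders, and dbutils is the Databricks notebook utility:

    # Credentials come from a Databricks secret scope, never hard-coded.
    jdbc_df = (spark.read
        .format("jdbc")
        .option("url", "jdbc:oracle:thin:@//<host>:1521/<service>")
        .option("dbtable", "PS_STDNT_ENRL")
        .option("user", dbutils.secrets.get("ingest", "ps_user"))
        .option("password", dbutils.secrets.get("ingest", "ps_password"))
        .option("fetchsize", "10000")   # tune for large extracts
        .load())

    jdbc_df.write.format("delta").mode("append").saveAsTable("main.bronze.ps_stdnt_enrl")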
Develop data quality checks, validation rules, and anomaly detection mechanisms to ensure data integrity across all layers.
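A minimal sketch of such a gate, splitting rows into valid and quarantined sets; the rules, threshold, and table names are assumptions for illustration:

    from pyspark.sql import functions as F

    df = spark.table("main.bronze.enrollments_raw")

    rules = (
        F.col("student_id").isNotNull()
        & F.col("credits").between(0, 30)
        & F.col("term").rlike(r"^\d{4}(FA|SP|SU)$")
    )

    valid = df.filter(rules)
    quarantine = df.exceptAll(valid)   # catches rows where a rule was false or null

    # Crude anomaly detection: alert or abort when the failure rate spikes.
    failure_rate = quarantine.count() / max(df.count(), 1)
    if failure_rate > 0.05:
        raise ValueError(f"quality gate tripped: {failure_rate:.1%} of rows failed")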
Integrate monitoring and observability tools (e.g., Databricks metrics, Grafana) to track ETL performance, latency, and failures.
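One lightweight pattern is to append per-run metrics to a Delta table that Grafana or a Databricks dashboard can query; build_silver and the ops table are hypothetical names:

    import time
    from pyspark.sql import Row

    start = time.time()
    rows_written = build_silver()   # hypothetical pipeline step returning a row count

    spark.createDataFrame([Row(
        pipeline="enrollments_silver",
        run_ts=int(start),
        rows_written=rows_written,
        duration_s=time.time() - start,
        status="ok",
    )]).write.format("delta").mode("append").saveAsTable("main.ops.pipeline_metrics")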
Enforce data security best practices including row-level security, encryption at rest/in transit, and fine-grained access control via Unity Catalog.
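For the row-level security piece, a sketch using Unity Catalog row filters, issued from Python; the governance schema, group name, and region logic are placeholders:

    # Only members of data-admins see all rows; everyone else sees US rows.
    spark.sql("""
        CREATE OR REPLACE FUNCTION main.governance.region_filter(region STRING)
        RETURN IS_ACCOUNT_GROUP_MEMBER('data-admins') OR region = 'US'
    """)

    spark.sql("""
        ALTER TABLE main.silver.enrollments
        SET ROW FILTER main.governance.region_filter ON (region)
    """)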
Enable data scientists by delivering high-quality, feature-rich data sets for model training and inference.
Collaborate with AI/ML teams to create reusable feature stores and training pipelines.
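A sketch of what publishing a reusable feature table can look like, assuming Unity Catalog and the databricks-feature-engineering package; the feature names and aggregations are illustrative:

    from databricks.feature_engineering import FeatureEngineeringClient
    from pyspark.sql import functions as F

    features = (spark.table("main.silver.enrollments")
        .groupBy("student_id")
        .agg(
            F.count("*").alias("enrollment_count"),
            F.avg("credits").alias("avg_credits"),
        ))

    fe = FeatureEngineeringClient()
    fe.create_table(
        name="main.features.student_enrollment_features",
        primary_keys=["student_id"],
        df=features,
        description="Per-student enrollment aggregates for model training",
    )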
Maintain technical documentation, architecture diagrams, data dictionaries, and runbooks for all pipelines and components.
Submit a weekly schedule of hours worked and progress reports outlining completed tasks, upcoming plans, and blockers.
Requirements
Hands-on experience with Databricks, Delta Lake, and Apache Spark for large-scale data engineering.
Deep understanding of ETL/ELT pipeline development, orchestration, and monitoring in cloud-native environments.
Experience implementing the Medallion Architecture (Bronze/Silver/Gold) and working with data versioning and schema enforcement in enterprise-grade environments.
Strong proficiency in SQL, Python, or Scala for data transformations and workflow logic.
Proven experience integrating enterprise platforms (e.g., PeopleSoft, Salesforce, D2L) into centralized data platforms.
Familiarity with data governance, lineage tracking, and metadata management tools.