Own the architecture, deployment, and daily management of Databricks Unity Catalog across dev, test, and production environments
Define and enforce catalog structures, schemas, external locations, storage credentials, and metadata tagging standards
Design and implement fine-grained access control policies using Role-Based Access Control (RBAC) and Attribute-Based Access Control (ABAC)
Manage service principals, identity federation (IdP), system privileges, and secure secrets management via cloud Key Vaults integrated into Databricks secret scopes
Oversee multi-workspace administration, including cluster policies, compute governance, runtime versions, and SQL Warehouses
Build, optimize, and maintain CI/CD workflows for data assets using modern DevOps tooling (GitHub Actions, Azure DevOps, or GitLab CI)
Monitor platform usage, analyze audit logs, and establish dashboards to track platform spend
Implement autoscaling policies, spot instance strategies, and cluster-sizing guardrails to control cloud spend without impacting business SLAs
Manage and mentor a small team of Platform Engineers/DataOps analysts
Act as the primary incident manager for critical platform disruptions, and define SLIs/SLOs for data availability
Maintain a current understanding of AI agent architectures, Retrieval-Augmented Generation (RAG) pipelines, and open integration protocols such as MCP (Model Context Protocol)
Requirements
10+ years in Data Operations, Platform Engineering, or Data Architecture
5+ years serving as a lead administrator/architect for enterprise-scale Databricks environments
Master’s degree in Mathematics, Computer Science, Electrical and Computer Engineering, or a closely related STEM field, or a BS with experience working with data technologies, is preferred
Deep architectural knowledge of Unity Catalog, including data lineage, row/column-level filtering, and external data shares (Delta Sharing)
Strong command of cloud data security principles, including identity management, IAM policies, network security (VPC/VNet injection, Private Link), and compliance frameworks and regulations (e.g., SOC 2, HIPAA, or GDPR)
Expert-level knowledge of Terraform or cloud-native infrastructure automation
Proficient with Python, PySpark, and SQL for pipeline optimization, custom platform alerting, and script automation
Databricks Certified Platform Administrator or Databricks Certified Data Engineer Professional is preferred
Experience in data quality assurance, control, and lineage for large datasets in relational and non-relational databases
Experience in medical device, healthcare, or manufacturing industries is desirable