Empiric is a premier technology consultancy seeking a high-caliber Databricks Consultant/Engineer to lead the design and implementation of mission-critical Data and AI ecosystems. The role focuses on Application-Level Disaster Recovery and Automated Governance for enterprise-scale clients, ensuring data pipelines and ML models are fully restorable in high-stakes environments.
Responsibilities:
- Design and implement Active-Passive DR strategies using specialized DR Orchestration Runbooks
- Ensure full-stack recovery that goes beyond infrastructure: restore data pipelines, MLflow models, metadata, and Feature Stores
- Architect cross-region replication patterns specifically for ADLS and Delta Lake
- Automate DR orchestration using Terraform, Bicep, or ARM templates
- Solve complex retention/archival problems using the Archival & Retention Orchestrator
- Enforce regulatory retention and automated deletion to ensure audit-ready compliance
- Tune VACUUM retention settings and Delta table versioning
- Manage end-to-end data lineage and governance within Unity Catalog for both structured and unstructured data
- Build and maintain robust automation via Databricks Workflows and Azure Data Factory (ADF)
- Implement compliance-aware designs, including immutable records and legal hold patterns
- Manage ADLS lifecycle policies, including tiering, archival, and deletion automation
Requirements:
- Deep expertise in Delta Lake, Unity Catalog, MLflow, and Databricks Workflows
- Proficiency in Terraform, Bicep, or ARM templates for automated deployments
- Strong experience with Azure Data Lake Storage (ADLS) and Azure Data Factory (ADF); familiarity with Microsoft Fabric is a plus
- Proven track record of designing cross-region replication and DR runbooks
- Experience in compliance-heavy environments (HIPAA, GDPR, or financial regulations), with a focus on retention and immutable storage
- Ability to translate complex DR and Governance needs into clear roadmaps for stakeholders
- High level of self-discipline for a fully remote role with synchronous working hours in EST/CST
- You don't just move data; you protect its integrity and lifecycle