Exactera is a FinTech SaaS start-up founded in 2016, focused on providing corporate tax solutions powered by AI and cloud-based technologies. The Lead Data Platform Engineer will architect and implement a centralized data platform on Databricks, optimizing for cost and performance while enabling Data Engineers to build confidently on the platform.
Responsibilities:
- Design data models and implement unified schemas across multiple disparate product lines
- Design and implement multi-catalog governance strategy supporting data isolation, cross-product data sharing, and comprehensive lineage tracking across our product portfolio
- Establish patterns for Z-ordering, compaction, and liquid clustering at multi-TB scale. Define table structures, partitioning strategies, and retention policies that balance query performance with storage costs
- Build declarative pipeline patterns using Delta Live Tables. Create orchestration workflows for ingesting data from internal sources such as SQL databases and S3
- Integrate with third-party data sources such as ERP systems (e.g., NetSuite) and external data providers (e.g., S&P), with automated ingestion, robust error handling, and monitoring
- Implement cost monitoring and optimization strategies, establish data quality frameworks, and create self-service patterns that enable Data Engineers to work independently while maintaining governance standards
- Lead the architecture for migrating multi-terabyte datasets from legacy systems to Databricks—establishing patterns that will be reused across multiple product lines
- Design Unity Catalog structures enabling secure data separation between product lines while allowing controlled cross-product analytics where appropriate
- Build infrastructure that scales efficiently—through intelligent caching, query optimization, and compute management strategies that avoid linear cost growth
- Establish monitoring, alerting, and data quality validation ensuring the platform operates reliably as a foundation for both analytics and AI workloads
Requirements:
- Databricks Expertise (Required)
- Unity Catalog: Production experience with multi-catalog governance, metastore design, and lineage tracking
- Data Structuring: Experience designing and building unified schemas across multiple disparate product lines
- Delta Lake: Expert-level experience with Z-ordering, compaction, liquid clustering, and performance tuning at multi-TB scale
- Delta Live Tables: Strong hands-on experience building declarative ETL pipelines, including change data capture and expectations/constraints
- Databricks Workflows: Experience with job orchestration, scheduling, and operational monitoring
- Business Intelligence: Experience enabling company-wide analytics and reporting with modern business intelligence tools and maintaining source of truth data and metrics
- PySpark & Databricks SQL: Strong proficiency for code review, performance tuning, and query optimization
- Core Platform Engineering: 5-8 years in data engineering or data platform roles, with 3+ years hands-on Databricks experience
- Track record leading at least one significant platform build or migration project
- AWS experience (S3, IAM, VPC) with the ability to collaborate on infrastructure decisions
- Infrastructure-as-code experience (Terraform preferred)
- Technical Leadership: Demonstrated ability architecting data platforms from first principles and defending technical decisions
- Strong written and verbal communication: able to document architecture decisions and present to both technical and business stakeholders
- Experience with financial data, accounting systems (NetSuite), or enterprise ERP platforms
- Background building platforms that serve AI/ML workloads (experience preparing data for downstream ML consumption, RAG and retrieval, and LLMs)
- Understanding of advanced intelligence concepts such as relationship surfacing with knowledge graphs
- Familiarity with data governance frameworks and compliance requirements for regulated industries