Exactera is a FinTech SaaS start-up founded in 2016, specializing in corporate tax solutions powered by AI and cloud-based technologies. They are seeking a Staff Data Engineer to provide technical leadership for the data engineering team, focusing on building and maintaining production data pipelines and driving architectural decisions as the platform scales post-migration.
Responsibilities:
- Build and maintain production data pipelines within the patterns and governance established by the Lead Data Platform Engineer, ensuring reliability and performance at multi-terabyte scale
- Exercise architectural judgment on data modeling, pipeline design, and platform usage, translating complex business requirements into scalable data solutions across the product portfolio
- Engage proactively with product and engineering stakeholders to translate requirements into data solutions, serving as the primary onshore technical point of contact for data engineering needs
- Drive platform quality through code reviews, testing practices, and engineering standards that ensure the team delivers reliable, maintainable data infrastructure
- Serve as onshore escalation point and institutional knowledge backup for platform decisions, reducing single-point-of-failure risk and building onshore technical depth as the platform scales
- Implement data pipelines that serve multiple product lines (Transfer Pricing, R&D Services, RoyaltyStat, Provisioning) with distinct data requirements, ensuring each product gets the data it needs reliably and on schedule
- Lead pipeline implementation for migrating multi-terabyte datasets from legacy systems to Databricks, working within the architecture defined by the Lead Data Platform Engineer
- Provide the senior judgment that the current nearshore team cannot: own problems end-to-end, make independent architectural decisions, and mentor engineers to raise the quality bar across the team
- Bridge the gap between product teams and data infrastructure, translating business requirements into data solutions and ensuring the data platform delivers on product commitments
Requirements:
- SQL, Python, and PySpark—production pipeline implementation and performance optimization
- Databricks experience—Delta Lake, Workflows, and Databricks SQL; Unity Catalog familiarity preferred
- 5+ years in data engineering with demonstrated ability to own problems end-to-end without close direction
- Experience building and maintaining ETL/ELT pipelines at scale, including error handling, monitoring, and data quality validation
- Strong data modeling skills across structured and semi-structured data sources
- AWS experience (S3, IAM, VPC) with ability to collaborate on infrastructure decisions
- Infrastructure-as-code experience (Terraform preferred)
- Familiarity with data governance patterns (Unity Catalog, data lineage, access controls)
- Demonstrated ability to exercise independent architectural judgment—not just ticket execution
- Experience mentoring or guiding junior and mid-level data engineers
- Strong written and verbal communication—able to document architecture decisions and engage directly with both technical and business stakeholders
- Onshore (US-based)—role requires timezone overlap, async-light communication, and direct stakeholder engagement
- Experience with financial data, accounting systems (NetSuite), or enterprise ERP platforms
- Background building pipelines that serve AI/ML workloads (preparing data for downstream ML consumption, RAG, and LLMs)
- Familiarity with data governance frameworks and compliance requirements for regulated industries
- Experience working alongside or transitioning from nearshore engineering teams