Fuel Cycle empowers leading organizations with agile research solutions that deliver decision-ready insights. The company is seeking a Senior Data Engineer to design and build the Databricks-first data lake and pipeline infrastructure that will underpin its future AI products.

Responsibilities (what you will design and build):

A multi-tenant data lake and warehouse that unifies all research data — surveys, qualitative feedback, CRM, media transcripts, and more — in a structured, AI-consumable format
Tenant-isolated data architecture where enterprise client data is structurally separated at the storage and query layer
Provenance-aware data models where every data point carries full traceability back to its source
Batch ingestion pipelines that migrate and continuously sync data from existing relational databases and cloud storage into the new lake architecture
A nightly profile enrichment pipeline that rebuilds living user profiles from all data sources within each client account
Data access layers serving AI agents via MCP, qualitative search via RAG pipelines, statistical computation tools, REST APIs, and bulk export

Requirements:

5+ years of deep, hands-on experience building production lakehouses on Databricks
You write clean PySpark and Python, model data thoughtfully, and know how to build for a multi-tenant SaaS environment
Deep production experience across the Databricks platform including Unity Catalog, Delta Live Tables, Databricks SQL, and Workflows
Delta Lake as a production table format — ACID transactions, schema evolution, performance optimization, and multi-tenant governance via Unity Catalog
Experience building and maintaining dbt transformation projects using the Databricks adapter in a production environment
PySpark for large-scale data transformation and batch pipeline authoring
Strong understanding of batch ingestion pipeline design — migrating from relational sources like MySQL and PostgreSQL into a lakehouse architecture
Experience with a modern pipeline orchestrator such as Dagster, Prefect, or Databricks Workflows; Dagster experience is a strong positive
Familiarity with vector databases, embedding pipelines, and RAG patterns for AI workloads — using tools such as Databricks Vector Search, pgvector, or Amazon OpenSearch
Exposure to AI agent and LLM-serving infrastructure including Amazon Bedrock, AgentCore, and Strands
Experience with data cataloging and governance tools such as Unity Catalog or OpenMetadata
Data modeling for multi-tenant analytical workloads — partitioning strategy, schema design, and tenant isolation patterns
Databricks on AWS — workspace configuration, S3 integration, IAM, and cost governance
Infrastructure as code using Databricks Asset Bundles or Terraform
Strong Python and SQL skills
Databricks certifications — Data Engineer Associate or Professional
Salesforce or CRM data integration experience
Prior experience in a multi-tenant SaaS environment with strict data isolation requirements
Experience migrating from OLTP to a lakehouse architecture
Candidates with experience in AWS-native data services are strongly valued. Engineers who understand both Databricks and AWS-native approaches bring a broader architectural perspective that helps the team make better long-term platform decisions.
Apache Iceberg, AWS Glue, Athena, and DynamoDB experience
