Fuel Cycle is a market research disruptor that empowers organizations with agile research solutions. The company is seeking a Senior Data Engineer to design and build a Databricks-first data lake and pipeline infrastructure that will support future AI products.

Responsibilities:

Build a multi-tenant data lake and warehouse that unifies all research data (surveys, qualitative feedback, CRM, media transcripts, and more) in a structured, AI-consumable format
Design a tenant-isolated data architecture in which enterprise client data is structurally separated at the storage and query layers
Create provenance-aware data models in which every data point carries full traceability back to its source
Build batch ingestion pipelines that migrate and continuously sync data from existing relational databases and cloud storage into the new lake architecture
Build a nightly profile enrichment pipeline that rebuilds living user profiles from all data sources within each client account
Build data access layers that serve AI agents via MCP, qualitative search via RAG pipelines, statistical computation tools, REST APIs, and bulk export
Integrate quickly with the engineering team and contribute meaningfully to the data platform build
Take ownership of assigned pipeline and infrastructure work end-to-end, from design through production
Bring architectural recommendations and solutions proactively, rather than waiting for direction
Demonstrate strong collaboration and communication across engineering and product teams

Requirements:

5+ years of deep, hands-on experience building production lakehouses on Databricks
You write clean PySpark and Python, model data thoughtfully, and know how to build for a multi-tenant SaaS environment
Deep production experience across the Databricks platform including Unity Catalog, Delta Live Tables, Databricks SQL, and Workflows
Delta Lake as a production table format — ACID transactions, schema evolution, performance optimization, and multi-tenant governance via Unity Catalog
Experience building and maintaining dbt transformation projects using the Databricks adapter in a production environment
PySpark for large-scale data transformation and batch pipeline authoring
Strong understanding of batch ingestion pipeline design — migrating from relational sources like MySQL and PostgreSQL into a lakehouse architecture
Experience with a modern pipeline orchestrator such as Dagster, Prefect, or Databricks Workflows; Dagster experience is a strong positive
Familiarity with vector databases, embedding pipelines, and RAG patterns for AI workloads — using tools such as Databricks Vector Search, pgvector, or Amazon OpenSearch
Exposure to AI agent and LLM-serving infrastructure including Amazon Bedrock, AgentCore, and Strands
Experience with data cataloging and governance tools such as Unity Catalog or OpenMetadata
Data modeling for multi-tenant analytical workloads — partitioning strategy, schema design, and tenant isolation patterns
Databricks on AWS — workspace configuration, S3 integration, IAM, and cost governance
Infrastructure as code using Databricks Asset Bundles or Terraform
Strong Python and SQL skills
Proactive Ownership: You bring recommendations and solutions to your manager — you don't wait to be told what to do
Architectural Judgment: You have the judgment to make the right foundational decisions and defend them
Greenfield Builder: You thrive on greenfield builds and take full ownership from design through to production
Comfort with Ambiguity: You are comfortable with ambiguity and can translate high-level vision into a concrete engineering plan
Outsized Impact: You understand that on a small team your decisions have outsized and lasting impact
Databricks certifications — Data Engineer Associate or Professional
Salesforce or CRM data integration experience
Prior experience in a multi-tenant SaaS environment with strict data isolation requirements
Experience migrating from OLTP to a lakehouse architecture
Experience with AWS-native data services is strongly valued. Engineers who understand both Databricks and AWS-native approaches bring a broader architectural perspective that helps the team make better long-term platform decisions.
Apache Iceberg, AWS Glue, Athena, and DynamoDB experience
