Fora Financial is a technology-enabled provider of flexible financing to small and medium-sized businesses. The company is seeking a Staff Data Engineer to lead the modernization of its legacy stack and build a platform for AI-native analytics, with a focus on architecture, strategy, execution, and data quality.
Responsibilities:
- Data platform architecture: ingestion patterns, warehouse design, environment strategy, orchestration, access governance, and reliability standards
- Freshness strategy: deciding which data needs real-time, near-real-time, daily, or ad hoc refreshes, and designing accordingly
- Streaming vs. batch decisions: making pragmatic tradeoffs across business value, cost, complexity, failure modes, and operational burden
- Source ingestion: batch, incremental, API-based, file-based, CDC, and streaming patterns where they make sense
- Pipeline reliability: dependencies, retries, alerts, backfills, incident response, runbooks, monitoring, and support expectations
- New source onboarding: requirements → source profiling → ingestion design → QA → documentation → support ownership
- Legacy migration: helping retire brittle reporting paths such as Azure Data Factory, SQL backup workflows, TRS Daily, and other duplicate pipelines
- Snowflake platform operations: roles, permissions, service accounts, connector ownership, environment separation, performance, cost, and governance
- Data contracts: schema-change handling, new-field availability, upstream SLAs, source defects, and escalation paths
- Data quality and observability: freshness, volume movement, nulls, duplicates, reconciliation, anomaly detection, and critical business-rule checks
- AI-enabled leverage: using AI and automation to improve debugging, documentation, pipeline scaffolding, testing, monitoring, and operational workflows
Requirements:
- Deep data engineering judgment. You have designed, built, and operated production platforms, not just individual pipelines
- Hands-on depth. You move seamlessly from architecture discussions to Python, SQL, deployment scripts, and production debugging
- Strong ingestion fundamentals. You are comfortable with APIs, CDC, backfills, idempotency, schema drift, and failure recovery
- Snowflake fluency. You are comfortable with warehouse design, RBAC, performance tuning, and cost controls
- Data quality discipline. You know which checks matter and make quality visible before users find issues
- Independent ownership and communication. You can sequence ambiguous work, write useful design docs, align technical decisions with business outcomes, and carry problems to resolution
- AI-native leverage. You actively use LLMs and agents to accelerate engineering work without outsourcing judgment
- Lending, fintech, or financial-services data experience
- CDC, Debezium, Fivetran, Airbyte, Azure Data Factory, dbt Cloud, Dagster, Airflow, Prefect, or equivalent tooling
- Snowflake performance tuning, RBAC, data sharing, warehouse cost optimization, or Iceberg
- Data observability with Monte Carlo, Elementary, dbt tests, custom monitors, or similar
- Data contracts, source SLAs, or schema-change processes with Engineering teams
- AI-native analytics, semantic layers, MCP servers, agent QA, or governed context retrieval
- Lightweight internal tools, scripts, or agents that reduce repetitive platform work