Role Overview

Own data platform architecture and technical direction: Lead the design of systems that span the full pipeline stack — raw ingestion, streaming and batch transformations, analytical models, and the serving layer that downstream consumers depend on. Make architectural decisions that balance reliability, performance, cost, and long-term maintainability. Set the patterns and standards that other engineers follow.
Lead the hardest cross-functional technical problems: Drive complex initiatives that span multiple teams and services. Define data contracts with upstream producers, lead schema evolution strategies, and resolve systemic friction between data producers and consumers. Be the person who steps in when a problem is too ambiguous or cross-cutting for a single team to solve.
Raise the engineering bar across the team: Set and enforce standards for data modeling, pipeline reliability, testing practices, code quality, and operational excellence. Mentor senior engineers through design reviews, pairing, and technical coaching. Influence how the team thinks about building systems, not just what they build.
Keep the platform reliable and the data trustworthy: Own the reliability posture of the most critical data systems. Define and uphold SLAs and SLOs for key pipelines, lead incident response for complex data failures, and drive systemic fixes rather than just immediate patches. Champion observability, data quality, and operational rigor as first-class concerns.
Make it easy for teams to own their data: Build tooling and establish practices that enable service and application teams to effectively manage their data. Coach teams on compliance strategies, performance tuning, event-driven design, and schema evolution so they can take ownership without creating bottlenecks on the data team.
Deliver with ownership and grit: Take the most ambiguous, highest-stakes projects from problem definition to production. Work through technical blockers, cross-functional dependencies, and competing priorities. Keep stakeholders informed and build a track record of delivering high-quality data assets that teams trust and depend on.

Requirements

Bachelor’s degree (or equivalent experience) in Computer Science, Engineering, or a related technical field.
8+ years of data engineering experience with a proven track record of building and operating reliable production data platforms at scale.
5+ years of hands-on experience with Python and SQL, with strong proficiency in both.
5+ years of experience with distributed data processing frameworks (e.g., Spark/PySpark, Flink) and with data warehouse design, including star and snowflake schemas.
3+ years of experience deploying and operating pipelines in the cloud, including CI/CD, monitoring, and incident response.
Track record of leading complex, cross-functional technical initiatives — driving architectural decisions and delivering outcomes across team boundaries.
Experience mentoring engineers and setting engineering standards across a team.

Tech Stack

Cloud
PySpark
Python
Spark
SQL

Benefits

Inclusive healthcare and benefits: On top of comprehensive medical, dental, and vision coverage, we offer employees and their family members help with gender-affirming care, tools for family and fertility planning, and travel reimbursements if healthcare isn’t available where you live.
Planning for the future: Start saving for the future with our traditional or Roth 401(k) retirement plan options, which include a 2% company match.
Modern life stipends: Manage your own learning and development.
Grow with us: Buy company stock at a discount through our ESPP, with easy payroll deductions.

Staff Data Engineer – Data & ML Platform
