Build and maintain the shared data access library and SDKs that Platform, Packaging, and Dataset API teams use to read from and write to multiple data sources (Snowflake, S3, RDS). Design interfaces that abstract source-level complexity while providing built-in auth, RBAC enforcement, pagination, and query governance.
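To give a flavor of the kind of interface this role owns, here is a minimal sketch of a read path with centralized RBAC enforcement and cursor-based pagination. All class and method names are illustrative, not the actual library API.

```python
# Hypothetical sketch of the SDK's read path; names are illustrative.
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Iterator


@dataclass
class Page:
    rows: list[dict[str, Any]]
    next_cursor: str | None  # opaque cursor so callers never paginate by hand


class DataSource(ABC):
    """One implementation per backend (Snowflake, S3, RDS)."""

    @abstractmethod
    def read(self, query: str, cursor: str | None = None) -> Page: ...


class DataAccessClient:
    def __init__(self, source: DataSource, principal: str, rbac):
        self._source = source
        self._principal = principal
        self._rbac = rbac  # assumed policy engine exposing a check() method

    def read_all(self, query: str) -> Iterator[dict[str, Any]]:
        # RBAC is enforced once, centrally, before any source is touched.
        if not self._rbac.check(self._principal, "read", query):
            raise PermissionError(f"{self._principal} may not run this query")
        cursor = None
        while True:
            page = self._source.read(query, cursor)
            yield from page.rows
            if page.next_cursor is None:
                break
            cursor = page.next_cursor
```

Consuming teams then iterate over `read_all()` without knowing which backend answered, which is the source-level abstraction the role calls for.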
Design and implement event-driven data flows using event brokers, CDC connectors, a schema registry, event routing, and dead-letter queues. Ensure events flow reliably and that failures are visible and recoverable.
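"Visible and recoverable" in practice usually means a retry-then-dead-letter pattern like the sketch below. The `broker_client` and `dlq_client` objects are hypothetical stand-ins for whatever broker is in use, assuming each exposes a `publish()` method.

```python
# Illustrative consumer step showing the dead-letter pattern.
import json
import logging

logger = logging.getLogger("events")
MAX_ATTEMPTS = 3


def handle(event: dict, process, broker_client, dlq_client) -> None:
    attempts = event.get("attempts", 0)
    try:
        process(event["payload"])
    except Exception as exc:
        if attempts + 1 >= MAX_ATTEMPTS:
            # Retries exhausted: park the event somewhere visible and
            # replayable instead of dropping it silently.
            dlq_client.publish(json.dumps({**event, "error": repr(exc)}))
            logger.error("event %s dead-lettered: %s", event.get("id"), exc)
        else:
            # Re-enqueue with an incremented attempt counter.
            broker_client.publish(json.dumps({**event, "attempts": attempts + 1}))
            logger.warning("event %s retry %d", event.get("id"), attempts + 1)
```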
Build the systems that track how data moves through the platform (lineage), enforce who can access what (governance and RBAC), and log what happened (auditing). This includes PII handling, retention policy enforcement, and audit infrastructure for enterprise and federal compliance.
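One common shape for the auditing piece is a decorator that records who touched what, on both success and failure. This is a sketch under the assumption of an append-only `audit_sink`; the decorator and its arguments are hypothetical.

```python
# Hypothetical audit hook: every data access emits an immutable record
# of who, what, when, and whether it succeeded.
import functools
import json
import time


def audited(action: str, audit_sink):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(principal: str, resource: str, *args, **kwargs):
            record = {
                "ts": time.time(),
                "principal": principal,
                "action": action,
                "resource": resource,
            }
            try:
                result = fn(principal, resource, *args, **kwargs)
                record["outcome"] = "success"
                return result
            except Exception as exc:
                record["outcome"] = f"error: {exc!r}"
                raise
            finally:
                # Written even on failure, so denied or errored access
                # attempts are part of the audit trail too.
                audit_sink.append(json.dumps(record))
        return wrapper
    return decorator
```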
Instrument the data platform with OpenTelemetry, define and monitor SLOs for query latency and pipeline success rates, and build alerting that catches issues before they become incidents. You will be on-call for the systems you build.
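As a concrete reference point, the sketch below uses the real `opentelemetry-sdk` tracing API to wrap a query in a span; the span and attribute names are illustrative, and the exporter would point at a collector rather than the console in production.

```python
# Minimal OpenTelemetry tracing setup for the query path.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("data_platform")


def run_query(sql: str):
    # Each query becomes a span: latency SLOs can be computed from span
    # durations, and error-rate SLOs from span status.
    with tracer.start_as_current_span("query.execute") as span:
        span.set_attribute("db.operation", sql.split()[0].lower())
        ...  # execute against the warehouse here
```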
Contribute to infrastructure cost visibility and optimization: query cost estimation, workload right-sizing, and routing data to the most cost-effective storage tier for its access pattern.
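Tier routing boils down to a decision function over observed access patterns. The thresholds and tier names below are invented purely to show the shape of that decision, not real policy.

```python
# Toy illustration of access-pattern-based storage tier routing.
from dataclasses import dataclass


@dataclass
class AccessStats:
    reads_per_day: float
    days_since_last_read: int


def choose_tier(stats: AccessStats) -> str:
    if stats.reads_per_day >= 10:
        return "hot"   # e.g. warehouse-native storage
    if stats.days_since_last_read <= 90:
        return "warm"  # e.g. S3 Standard
    return "cold"      # e.g. S3 Glacier-class storage


# Rarely-read, long-idle data lands in the cheapest tier.
assert choose_tier(AccessStats(reads_per_day=0.01, days_since_last_read=200)) == "cold"
```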
Requirements
8+ years building platform infrastructure, data infrastructure, data platforms, or backend systems with significant data components. You have built and operated pipelines, data access layers, or ETL/ELT systems in production.
Strong proficiency in Python. Our stack is Python-heavy across Prefect, FastAPI, dbt, and the SDK layer.
Hands-on experience with SQL and at least two of: Snowflake, Redshift, Postgres. You understand the performance characteristics of each and can write queries that don't bring down production.
Experience with AWS — S3, RDS, EKS, EventBridge, IAM. Comfortable working in a Terraform-managed environment.
Experience with Kubernetes. Our workloads run on EKS and you will deploy, debug, and scale services on K8s.
Familiarity with data orchestration tools (Prefect, Airflow, or Dagster) and transformation frameworks (dbt).
Understanding of data governance concepts — RBAC, PII handling, audit logging, data lineage.
Fluency with AI-assisted development tools (Claude Code, Cursor, or similar). This is a hard requirement — the team uses these tools daily and we expect engineers to leverage them for code generation, debugging, and investigation.