Arena Club is pioneering the collectibles domain with the first-ever digital card show. The company is seeking a Senior Data Engineer to strengthen strategic decision-making, enhance operational performance, and integrate data across the company to unlock deeper insights into customer behavior and market performance.
Responsibilities:
- Maintain and optimize inbound and outbound ETL pipelines built on AWS Glue (Python Shell & Spark ETL)
- Manage Redshift cluster performance across various schemas
- Own integrations with SaaS data sources via AppFlow and direct connectors
- Operate outbound distribution pipelines to external vendors
- Manage infrastructure, alerting, and migration state tracking
- Lead the migration from ad-hoc SQL scripts to a Bronze/Silver/Gold medallion architecture with dbt as the transformation layer
- Design and implement dimensional models, i.e., fact and dimension tables
- Build the Silver staging layer (an illustrative job sketch follows this list)
- Architect the real-time change data capture (CDC) pipeline
- Implement data contracts and governance at the Silver layer to insulate downstream consumers from source changes
- Implement a hot/cold storage strategy via Redshift Spectrum
- Build the Unified Access Layer
- Design and automate Glue jobs
- Configure S3 lifecycle policies for progressive cost reduction
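
For context on the migration work above, here is a minimal sketch of the kind of Glue Python Shell job that builds the Silver staging layer: it copies a Bronze Parquet extract from S3 into a temp table and applies the classic Redshift delete-then-insert upsert. The bucket, IAM role, cluster endpoint, and the silver.orders / order_id names are all hypothetical, and redshift_connector is assumed to be installed via the job's additional Python modules.

```python
# Minimal sketch (hypothetical names throughout) of a Glue Python Shell job
# that stages a Bronze Parquet extract from S3 into a Silver table in Redshift
# using the classic delete-then-insert upsert pattern.
import redshift_connector  # assumed installed via the job's additional Python modules

BRONZE_PATH = "s3://example-bronze-bucket/orders/"              # hypothetical path
COPY_ROLE = "arn:aws:iam::123456789012:role/example-copy-role"  # hypothetical role

STATEMENTS = [
    # Stage the extract in a temp table shaped like the target.
    "CREATE TEMP TABLE orders_stage (LIKE silver.orders);",
    f"COPY orders_stage FROM '{BRONZE_PATH}' IAM_ROLE '{COPY_ROLE}' FORMAT AS PARQUET;",
    # Classic Redshift upsert: remove matching keys, then insert the fresh rows.
    "DELETE FROM silver.orders USING orders_stage "
    "WHERE silver.orders.order_id = orders_stage.order_id;",
    "INSERT INTO silver.orders SELECT * FROM orders_stage;",
]

def run_upsert() -> None:
    conn = redshift_connector.connect(
        host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",  # hypothetical
        database="analytics",
        user="glue_etl",
        password="fetch-from-secrets-manager-instead",  # placeholder, never hard-code
    )
    try:
        cur = conn.cursor()
        for sql in STATEMENTS:
            cur.execute(sql)
        conn.commit()  # one transaction covering stage, delete, and insert
    finally:
        conn.close()

if __name__ == "__main__":
    run_upsert()
```

In practice the same load would typically be expressed as an incremental dbt model once the medallion migration lands; the raw-SQL form above is shown only to illustrate the staging and upsert mechanics.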
Requirements:
- 5+ years in data engineering with production pipeline ownership (not just analytics or BI)
- Deep AWS experience: Glue (both Python Shell and Spark ETL), Redshift, S3, IAM, EventBridge, Lambda, AppFlow
- Strong SQL: complex joins, window functions, MERGE/UPSERT patterns, Redshift-specific optimization (sort keys, dist keys, VACUUM/ANALYZE)
- Python fluency: boto3, data processing libraries, writing production Glue scripts (not just notebooks)
- Dimensional modeling: star schemas, fact/dimension design, SCD Type 1 and Type 2 implementation
- dbt: hands-on experience building and maintaining staging, intermediate, and mart models with tests and documentation
- Data warehouse operations: schema migration, incremental loads, backfill strategies, monitoring, and alerting
- Redshift Spectrum: experience with external schemas, Parquet/Hive partitioning, and unified hot/cold querying
- CDC / streaming: Postgres WAL, Debezium, EventBridge, or similar change data capture pipelines
- Data Mesh concepts: domain-oriented ownership, data-as-a-product thinking, federated governance
- AppFlow & SaaS integrations: configuring and troubleshooting managed connectors for Stripe, Zendesk, Mixpanel, etc.
- Cost optimization: right-sizing Glue jobs (Python Shell vs. Spark), Redshift concurrency scaling, S3 lifecycle policies (an illustrative policy sketch follows this list)
- Vendor distribution: building outbound API sync jobs with rate limiting, SFTP transfers, webhook delivery
- Familiarity with marketplace or e-commerce data (orders, payments, attribution, promo codes)
- Experience with Mixpanel, Customer.io, or Singular data exports and event schemas
- Prior experience migrating from monolithic ETL to medallion or lakehouse architectures
- Exposure to data governance tooling: data catalogs, lineage tracking, quality frameworks (e.g., Great Expectations, dbt tests)
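
To make the hot/cold storage and cost-optimization expectations concrete, the sketch below uses boto3 to apply the sort of S3 lifecycle policy referenced above, progressively transitioning aged Bronze objects to cheaper storage classes. The bucket name, prefix, and day thresholds are hypothetical.

```python
# Sketch of an S3 lifecycle policy for progressive cost reduction on aged
# Bronze data: objects move to Infrequent Access after 90 days and to Glacier
# after 365 days. Bucket name, prefix, and day thresholds are hypothetical.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="example-bronze-bucket",  # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "bronze-progressive-tiering",
                "Filter": {"Prefix": "bronze/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 90, "StorageClass": "STANDARD_IA"},
                    {"Days": 365, "StorageClass": "GLACIER"},
                ],
                # Clean up incomplete multipart uploads left behind by failed writes.
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            }
        ]
    },
)
```

Partitions transitioned to Infrequent Access stay directly queryable through Redshift Spectrum external tables, while Glacier-tiered objects would generally need to be restored first.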