Role Overview

Own the path from raw transactional and event data to trustworthy, well-modeled datasets powering BetMGM's analytics, ML, and operational systems.
Design, build, and operate batch, micro-batch, and streaming pipelines feeding Snowflake — Prefect-orchestrated flows on ECS Fargate, dbt for transformation, Snowpipe Streaming and Kafka for event ingestion.
Own the full dbt lifecycle (sources → staging → intermediate → marts) with model contracts, freshness SLAs, automated tests, and version-controlled documentation.
Stand up Snowflake objects (warehouses, RBAC, resource monitors, Dynamic Tables, Iceberg tables) through Terraform — no ClickOps in production.
Build AWS-native infrastructure for data workloads — S3, ECS Fargate, Lambda, EMR Serverless, Glue Catalog, IAM, Secrets Manager, VPC endpoints — entirely in Terraform.
Maintain CI/CD pipelines (GitLab CI or GitHub Actions) that gate every change with linting, dbt build, unit tests, contract checks, and AI-assisted code review.
Tune warehouse sizing, clustering, and query patterns for cost and latency; instrument credit usage via ACCOUNT_USAGE; right-size before scaling up.
Design RBAC, masking policies, and row-access policies that satisfy a regulated operator without becoming an access bottleneck.
Own freshness SLAs and data contracts for the gold layer; triage incidents end-to-end.
Direct AI coding agents as a force multiplier — writing specs, decomposing work, reviewing AI-generated PRs, and owning the architectural decisions agents cannot make.
Partner with analytics engineers, data scientists, and ML platform engineers on shared standards (naming, testing, observability, lineage, cost attribution).

Requirements

BS or MS in Computer Science, Statistics, Math, or another STEM field, or equivalent practical experience.
5+ years building production data pipelines on a modern stack (Python + SQL + dbt + cloud).
Deep Snowflake — beyond SQL into administration: warehouse sizing, RBAC, resource monitors, Streams/Tasks, Dynamic Tables, secure data sharing, cost tuning via ACCOUNT_USAGE.
Strong AWS — S3, ECS/Fargate, Lambda, IAM, Secrets Manager, VPC — plus production experience with at least one of EMR Serverless, Glue, or MWAA.
Terraform for both cloud and Snowflake — you have owned IaC, not just touched it.
Orchestration fluency — Prefect, Airflow, or Dagster — and an opinion about when each is the right tool.
CI/CD ownership — you have built quality gates that block bad code, not just YAML pipelines that pass.
Bias toward outcomes — you describe past work in terms of SLAs, incidents, and customers served, not tool checklists.

Nice-to-Haves

Snowflake-native ML (Snowpark, Cortex AISQL, Snowflake Notebooks) for in-warehouse scoring or unstructured workloads.
Iceberg / open-table-format experience for cross-engine interoperability.
Streaming experience — Kafka, Snowpipe Streaming, or Kinesis — with stated latency budgets.
Reverse-ETL exposure (Hightouch, Census, or custom) into operational marketing or product systems.
A demonstrable track record of shipping more with AI in the loop than without — not 'I have used Cursor,' but 'this is how I design work for an agent to do.'

Tech Stack

Airflow
AWS
Cloud
ETL
Kafka
Python
SQL
Terraform

Benefits

Medical, Dental, Vision, Life, and Disability Insurance
401(k) with company match
Pre-tax spending accounts including health care FSA and commuter savings
Flexible paid time off
Professional development reimbursement and ongoing skills training opportunities
Employee resource groups
Swag, ticket giveaways, and more!

Senior Data Engineer
