GitHub, the world’s leading platform for agentic software development, is seeking a Senior Data Engineer to join its Revenue Data & Analytics team. In this role, you will build and maintain the governed data infrastructure that powers revenue understanding, focusing on data engineering, data modeling, and analytics engineering.
Responsibilities:
- Design, build, and maintain dbt models across medallion layers (bronze/silver/gold) in Microsoft Fabric Lakehouse and Warehouse, following Kimball dimensional modeling patterns — including SCD2 dimensions, incremental CDC pipelines, and metadata-driven approaches to minimize code duplication
- Author and enforce data quality checks and dbt tests across pipeline stages to catch anomalies before they reach downstream consumers; contribute to data cataloging and lineage to ensure governed datasets are discoverable and traceable
- Develop and maintain Airflow DAGs for orchestration — scheduling, dependency management, error handling, and alerting
- Containerize data workloads with Docker and deploy via GitHub Actions CI/CD pipelines, including automated testing, linting, and environment promotion (dev → staging → prod)
- Manage and optimize ADLS Gen2 and Delta Lake storage — partitioning, compaction, retention policies, and cost management
- Collaborate with analytics engineers, BI developers, and analysts to ensure gold-layer datasets serve Power BI, Trino, and downstream reporting needs
- Participate in architecture reviews and contribute to ADRs; support migration from legacy patterns toward a governed, metadata-driven platform with pragmatism about transition paths
- Own operational excellence across data pipelines — monitoring, alerting, incident response, and proactive detection of data drift, schema changes, and quality regressions
Requirements:
- 6+ years of experience in Software Engineering, Computer Science, or a related technical discipline, with proven experience maintaining and delivering production software in languages including, but not limited to, C, C++, C#, Java, JavaScript, Go, Ruby, Rust, or Python; OR equivalent experience
- A degree in Computer Science, Electrical Engineering, Electronics Engineering, Math, Physics, Computer Engineering, or a related field reduces the experience requirement: Associate's, 5+ years; Bachelor's, 4+ years; Master's, 2+ years; Doctorate, no additional experience required
- 5+ years of SQL experience with real fluency — window functions, CTEs, merge statements, query optimization; you think in sets, not loops
- Hands-on dbt experience (Core or Cloud) — models, tests, macros, Jinja, incremental materializations; dimensional modeling (Kimball star schemas, SCD2, conformed dimensions) a strong plus
- Orchestration experience (Airflow, Prefect, Dagster, or similar) for scheduling, dependencies, and error handling
- Cloud data platform experience — Azure preferred (Fabric, ADLS, Synapse); AWS or GCP experience transfers well; familiarity with Delta Lake, Apache Iceberg, or Spark is a bonus
- Docker, Git-based workflows, and CI/CD for data pipelines; Python or equivalent for engineering tasks
- Data quality tooling (Soda, dbt Elementary) and catalog/lineage tools (Purview, Atlan, DataHub, or similar)
- Familiarity with advanced patterns — medallion architecture, Data Vault 2.0, metadata-driven frameworks, or federated query engines (Trino/Presto)
- Experience with revenue, finance, or billing data — ARR, consumption models, hierarchy attribution, and account ownership complexity
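As an illustration of the set-based SQL fluency described above, the following sketch uses a CTE and the `ROW_NUMBER()` window function to keep only the latest record per account — the same pattern used to deduplicate CDC feeds. The table and column names (`account_events`, `arr`, `loaded_at`) are hypothetical; SQLite stands in for the warehouse so the example is self-contained:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account_events (account_id TEXT, arr REAL, loaded_at TEXT);
    INSERT INTO account_events VALUES
        ('acct-1', 100.0, '2024-01-01'),
        ('acct-1', 150.0, '2024-06-01'),
        ('acct-2',  80.0, '2024-03-01');
""")

# CTE + window function: one pass over the set, no row-by-row loop
latest = conn.execute("""
    WITH ranked AS (
        SELECT account_id, arr,
               ROW_NUMBER() OVER (
                   PARTITION BY account_id
                   ORDER BY loaded_at DESC
               ) AS rn
        FROM account_events
    )
    SELECT account_id, arr FROM ranked WHERE rn = 1
    ORDER BY account_id
""").fetchall()

print(latest)  # → [('acct-1', 150.0), ('acct-2', 80.0)]
```

Partitioning by the business key and ordering by load timestamp is the declarative, set-based answer to "latest version per key" — the shape of problem this role works in daily.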