Zeta Global provides an AI-Powered Marketing Cloud that uses advanced artificial intelligence to improve marketing efficiency. The Senior Data Engineer will design, build, and operate data pipelines supporting Zeta’s AdTech platform, with a focus on high-scale data processing and analytics-ready datasets.
Responsibilities:
- Build data pipelines: Develop robust batch and streaming pipelines (Kafka/Kinesis) to ingest, transform, and enrich large-scale event data (impressions, clicks, conversions, costs, identity signals)
- Create data aggregates & marts: Design and maintain curated aggregates and dimensional models for multiple consumers—prediction models, agents, BI dashboards, and measurement workflows
- Data modeling & contracts: Define schemas, data contracts, and versioning strategies to keep downstream systems stable as sources evolve
- Data quality & reliability: Implement validation, anomaly detection, backfills, and reconciliation to ensure completeness, correctness, and timeliness (SLAs/SLOs)
- Performance & cost optimization: Optimize compute/storage for scale (partitioning, file sizing, incremental processing, indexing), balancing latency, throughput, and cost
- Orchestration & automation: Build repeatable workflows with scheduling/orchestration (e.g., Airflow, Dagster, Step Functions) and CI/CD for data pipelines
- Observability for data systems: Instrument pipelines with metrics, logs, lineage, and alerting to accelerate detection and root-cause analysis of data issues
- Security & governance: Apply least-privilege access, PII-aware handling, and governance controls aligned with enterprise standards
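To make the pipeline responsibilities above concrete, here is a minimal sketch of the ingest → validate → aggregate pattern described (event validation feeding a small campaign-level aggregate). All field names (`event_type`, `campaign_id`, `ts`, `cost`) and the schema-contract rules are illustrative assumptions, not Zeta's actual data model.

```python
from collections import defaultdict

# Hypothetical event contract; fields and valid types are illustrative only.
REQUIRED_FIELDS = {"event_type", "campaign_id", "ts", "cost"}
VALID_TYPES = {"impression", "click", "conversion"}

def validate(event: dict) -> bool:
    """Basic completeness/correctness checks, per the data-quality bullet."""
    return REQUIRED_FIELDS <= event.keys() and event["event_type"] in VALID_TYPES

def aggregate(events: list[dict]) -> dict:
    """Roll validated events up into a per-campaign aggregate (a tiny 'mart')."""
    agg = defaultdict(lambda: {"impressions": 0, "clicks": 0, "cost": 0.0})
    for e in filter(validate, events):
        row = agg[e["campaign_id"]]
        if e["event_type"] == "impression":
            row["impressions"] += 1
        elif e["event_type"] == "click":
            row["clicks"] += 1
        row["cost"] += e["cost"]
    return dict(agg)

events = [
    {"event_type": "impression", "campaign_id": "c1", "ts": 1, "cost": 0.002},
    {"event_type": "click", "campaign_id": "c1", "ts": 2, "cost": 0.05},
    {"event_type": "bogus", "campaign_id": "c1", "ts": 3, "cost": 0.0},  # fails contract, dropped
]
print(aggregate(events))
```

In production this shape would run inside a streaming consumer (Kafka/Kinesis) or a Spark job rather than plain Python, with rejected records routed to a dead-letter store for reconciliation instead of being silently dropped.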
Requirements:
- 5+ years building production data pipelines and data products (batch and/or streaming) in a high-scale environment
- Strong experience with SQL and data modeling (dimensional modeling, star/snowflake schemas, event modeling)
- Hands-on experience with streaming systems (Kafka preferred) and/or AWS Kinesis, including event-driven designs
- Proficiency in one or more languages used for data engineering (Python, Java, Scala, or Go)
- Experience with distributed data processing (Spark, Flink, or equivalent) and performance tuning at scale
- Experience with AWS data services and cloud-native patterns (S3, Glue/EMR, Athena, Redshift, etc. as applicable)
- Familiarity with lakehouse/table formats and large-scale storage patterns (e.g., Parquet; Iceberg/Hudi/Delta are a plus)
- Experience with orchestration/workflow tooling (Airflow/Dagster/Step Functions) and CI/CD for data workloads
- Strong data quality/observability practices (tests, monitoring, lineage; understanding of SLAs/SLOs)
- Experience with SQL + NoSQL data stores (e.g., Postgres/MySQL; DynamoDB/Cassandra/Redis) and choosing the right store per use case
- Clear communicator and collaborator; able to work with both technical and non-technical audiences and translate their needs into reliable data interfaces
Preferred Qualifications:
- AdTech / programmatic advertising domain knowledge: DSP/SSP/exchange/RTB concepts and data flows
- Experience building measurement pipelines (attribution, incrementality, lift, or experimentation analytics)
- Experience supporting ML feature stores, offline/online feature generation, or model training datasets
- Experience with real-time analytics stores (Druid/ClickHouse/Pinot) and high-cardinality aggregation strategies
- Deep knowledge of data governance/privacy, including PII handling and consent-aware data processing
- Open-source contributions, publications, or conference speaking
- BS/MS in CS/Engineering or equivalent practical experience
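As a small illustration of the partitioning and incremental-processing skills the requirements call for, the sketch below derives a Hive-style `dt=/hour=` partition path from an event timestamp and filters a batch against a watermark. The bucket path and function names are hypothetical examples, not part of any specific platform.

```python
from datetime import datetime, timezone

def partition_path(base: str, ts: float) -> str:
    """Derive a Hive-style dt=/hour= partition path from an epoch timestamp (UTC)."""
    d = datetime.fromtimestamp(ts, tz=timezone.utc)
    return f"{base}/dt={d:%Y-%m-%d}/hour={d:%H}"

def incremental_batch(events: list[dict], watermark: float):
    """Select only events newer than the last processed watermark;
    return the batch plus the advanced watermark for the next run."""
    batch = [e for e in events if e["ts"] > watermark]
    new_watermark = max((e["ts"] for e in batch), default=watermark)
    return batch, new_watermark

# Example: route an event into its hourly partition.
print(partition_path("s3://bucket/events", 1700000000))
```

Partitioning by event date/hour keeps scans bounded for typical time-windowed AdTech queries, and the watermark pattern is the core of incremental (rather than full-reload) processing; table formats like Iceberg/Hudi/Delta formalize both ideas.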