NBCUniversal, one of the world's leading media and entertainment companies, is seeking a Data Engineer to join its Engineering & Operations team. The role involves designing, building, and operating scalable data pipelines and privacy-safe data integrations that support NBCUniversal’s data collaboration ecosystem.

Responsibilities:

Support partner onboarding into clean room environments across platforms such as Snowflake, LiveRamp, Databricks, or similar technologies
Follow clean room architecture patterns that are secure, scalable, privacy-preserving, and repeatable across partner engagements
Configure and manage clean room environments, including data access, environment setup, platform configuration, and release validation
Serve as the technical owner for assigned partner onboarding efforts, coordinating with product, engineering, operations, privacy, and partner-facing teams
Implement privacy-preserving controls such as aggregation thresholds, anonymization techniques, approved query patterns, and output validation checks
Deploy and manage Python-based libraries, templates, and reusable components within the clean room and data platform ecosystem
Support environment setup, configuration management, package deployment, and version-controlled release processes
Partner with software engineering teams to operationalize reusable libraries for audience, measurement, reporting, and partner-facing workflows
Ensure platform components are deployed consistently across partner environments and aligned with established engineering standards
Design, implement, and enforce granular role-based access control policies across data platform environments
Configure least-privilege service accounts, roles, grants, schemas, shares, and data access patterns
Partner with security, privacy, and platform teams to ensure access controls meet internal policies and partner-specific requirements
Validate that partner-facing outputs adhere to privacy, security, and business requirements before release
Design, build, and operate scalable ELT pipelines using advanced SQL, Snowpark, PySpark, dbt, or similar technologies
Develop and provision curated Gold datasets for audience, measurement, activation, and reporting use cases
Build reusable pipeline patterns that support batch and near real-time processing across Snowflake, Databricks, or similar platforms
Translate business and analytical requirements into reliable, well-documented, production-ready data products
Own pipeline performance, reliability, data correctness, and operational support for assigned data products
Implement and evolve identity resolution logic that maps internal NBCU data to third-party identifiers such as LUIDs, RampIDs, TransUnion IDs, or similar identity frameworks
Support privacy-safe identity workflows for audience matching, measurement, activation, and partner collaboration
Build validation checks to ensure identity mappings are accurate, secure, and compliant with approved usage patterns
Work with internal teams and external partners to troubleshoot match rates, data quality issues, and onboarding discrepancies
Build automated data quality checks using tools such as Great Expectations, dbt tests, custom SQL assertions, or similar frameworks
Define and monitor quality standards for schema drift, null rate spikes, volume anomalies, duplicate records, referential integrity, and unexpected data distribution changes
Create test strategies for partner-facing releases, including input validation, output validation, regression testing, and privacy checks
Document data assumptions, known limitations, validation logic, and operational support procedures
Optimize query performance and platform costs through query tuning, clustering/partitioning strategies, caching, incremental processing, and workload management
Implement query tagging, workload tracking, and chargeback/showback models to improve cost transparency and partner-level attribution
Establish monitoring, alerting, runbooks, and standard operating procedures to improve platform reliability and reduce incident time-to-resolution
Participate in incident response, root cause analysis, and continuous improvement efforts for production data workflows

Requirements:

Bachelor's degree in Computer Science, Information Systems, Software Engineering, Electrical Engineering, Electronics Engineering, Data Engineering, or a related technical field, or equivalent practical experience
3+ years of experience in data engineering, including building and operating production data pipelines, data models, and data products
Deep proficiency in advanced SQL and Python for data processing, automation, pipeline development, validation, and operational support
2+ years of hands-on experience with cloud data platforms such as Snowflake, Databricks, or similar technologies
Experience building scalable ELT pipelines using tools such as Airflow, dbt, Snowpark, PySpark, or similar technologies
Exposure to data clean room concepts or platforms such as Snowflake Clean Rooms, Databricks Clean Rooms, LiveRamp, Habu, or similar technologies
Exposure to advertising technology, audience activation, campaign delivery, reach and frequency, attribution, incrementality, or reporting workflows
Experience working with identity graphs, hashed identifiers, RampIDs, LUIDs, TransUnion IDs, device IDs, household IDs, or similar identity frameworks
Snowflake SnowPro Core Certification, Databricks Certified Data Engineer Associate, or similar cloud/data platform certification
