NBCUniversal is one of the world's leading media and entertainment companies, and it is seeking a Senior Data Engineer to lead the Engineering & Automation pillar. The role involves designing and implementing scalable data architectures and driving automation through AI agents and self-service tooling, while ensuring operational excellence across data collaboration ecosystems.
Responsibilities:
- Architecture Leadership: Define partner onboarding and clean room architecture patterns across Snowflake, LiveRamp, and Databricks that are secure, scalable, and repeatable
- Infrastructure Setup & Library Deployment: Configure and manage partner-specific clean room environments; deploy and manage Python-based libraries within the platform ecosystem
- MLOps Integration: Establish and maintain MLOps practices, including model serving, monitoring, and pipeline orchestration for AI/ML features deployed within the platform ecosystem
- Security & RBAC: Own design and enforcement of granular RBAC policies and least-privilege service accounts
- Partner Onboarding: Serve as the technical lead for onboarding new partners, implementing privacy-preserving controls (e.g., aggregation thresholds and anonymization techniques)
- Pipeline & Data Product Ownership: Design, build, and operate scalable ELT pipelines using Snowpark and/or PySpark and advanced SQL to provision Gold datasets
- Identity Resolution: Implement and evolve identity resolution logic mapping internal data to third-party (3P) identifiers (including LUIDs, RampIDs, and TransUnion IDs), ensuring privacy-safe practices
- Scalable Architecture: Design and operate scalable data architectures across Snowflake and Databricks supporting batch and near real-time processing patterns
- Data Quality by Design: Build robust automated checks (e.g., Great Expectations or custom SQL assertions) and define quality standards to detect schema drift, null rate spikes, and volume anomalies
- FinOps & Operational Excellence: Lead performance optimization across platforms (query tuning, caching, incremental processing) and define and implement query tagging and chargeback models for accurate cost attribution
- Operational Maturity: Establish monitoring, alerting, runbooks, and standard operating procedures to improve platform reliability and reduce incident time-to-resolution
- Testing & Validation: Validate that output data adheres to privacy and business requirements, and define test strategies for partner-facing releases
- Technical Escalation: Serve as the escalation point for diagnosing connection failures, data discrepancies, or latency issues with partner technical teams
- Agentic Workflows & Platform Enablement: Design and build internal AI agents (using frameworks such as LangChain and Snowflake Cortex) and mentor other engineers through code reviews, design discussions, and operational best practices
Requirements:
- Bachelor's degree or higher in Computer Science, Information Systems, Software Engineering, Electrical Engineering, or Electronics Engineering
- 5+ years of Data Engineering experience, with deep proficiency in advanced SQL and Python
- 3+ years of hands-on experience with cloud data platforms, specifically Snowflake or Databricks
- Proven experience building and operating scalable ELT pipelines using orchestration and transformation tools (e.g., Airflow, dbt)
- Strong track record designing production-grade systems (observability, reliability, performance tuning, incident response)
- Exposure to Data Clean Room concepts and Clean Room platforms such as LiveRamp, Snowflake, or Databricks
- Experience building applications with LLMs, RAG, Vector Databases, or frameworks like LangChain/LlamaIndex
- Ability to mentor other engineers through code reviews, design discussions, and operational best practices
- SnowPro Core Certification OR Databricks Certified Data Engineer Associate
- SnowPro Advanced: Data Engineer OR Databricks Certified Data Engineer Professional