Photon has powered digital experiences for Fortune 500 companies for the past 20 years. The company is seeking a Senior Data Engineer to work on large-scale batch pipelines and data systems that enable critical decision-making and AI-powered capabilities.

Responsibilities:

Partner with Data Science, Product, and Engineering to collect requirements and define the data ontology for Mail Data & Analytics
Lead and mentor junior Data Engineers to support Yahoo Mail’s ever-evolving data needs
Design, build, and maintain efficient and reliable batch data pipelines to populate core data sets
Develop scalable frameworks and tooling to automate analytics workflows and streamline users' interactions with data products
Establish and promote standard methodologies for data operations and lifecycle management
Develop new or improve and maintain existing large-scale data infrastructures and systems for data processing or serving, optimizing complex code through advanced algorithmic concepts and in-depth understanding of underlying data system stacks
Create and contribute to frameworks that improve the efficacy of the management and deployment of data platforms and systems, while working with data infrastructure to triage and resolve issues
Prototype new metrics or data systems
Define and manage Service Level Agreements for all data sets in allocated areas of ownership
Develop complex queries, very large volume data pipelines, and analytics applications to solve analytics and data engineering problems
Collaborate with engineers, data scientists, and product managers to understand business problems and technical requirements and to deliver data solutions
Provide engineering consulting on large and complex data lakehouse data

Requirements:

BS in Computer Science/Engineering, relevant technical field, or equivalent practical experience, with specialization in Data Engineering
8+ years of experience building scalable ETL pipelines on industry-standard ETL orchestration tools (Airflow, Composer, Oozie) with deep expertise in SQL, PySpark, or Scala
3+ years leading data engineering development directly with business or data science partners
Built, scaled, and maintained multi-terabyte data sets, with an expansive toolbox for debugging and unblocking large-scale analytics challenges (skew mitigation, sampling strategies, accumulation patterns, data sketches, etc.)
Experience with at least one major cloud's suite of offerings (AWS, GCP, Azure)
Developed or enhanced ETL orchestration tools or frameworks
Worked within a standard GitOps workflow (branch and merge, PRs, CI/CD systems)
Experience working with GDPR
Self-driven, challenge-loving, and detail-oriented, with a strong team spirit, excellent communication skills, and the ability to multitask and manage expectations
MS/PhD in Computer Science/Engineering or relevant technical field, with specialization in Data Engineering
3 years of experience in Google Cloud Platform technologies (BigQuery, Dataproc, Dataflow, Composer, Looker)

Senior Data Engineer - Dallas/California, United States
