Yahoo is a leading technology company known for its email services, and it is seeking a Data Engineer II to join its Mail Analytics Infrastructure & Data Engineering team. This role involves building and maintaining large-scale data infrastructure and systems to support data processing, analytics, and AI capabilities.
Responsibilities:
- Develop new large-scale data infrastructure and systems for data processing or serving, and improve and maintain existing ones, optimizing complex code through advanced algorithmic concepts and an in-depth understanding of the underlying data system stacks
- Create and contribute to frameworks that improve the efficacy of the management and deployment of data platforms and systems, while working with data infrastructure to triage and resolve issues
- Prototype new metrics or data systems
- Define and manage Service Level Agreements for all data sets in allocated areas of ownership
- Develop complex queries, very large volume data pipelines, and analytics applications to solve analytics and data engineering problems
- Collaborate with engineers, data scientists, and product managers to understand business problems and technical requirements and to deliver data solutions
- Provide engineering consulting on large and complex data lakehouse data
Requirements:
- BS in Computer Science/Engineering, relevant technical field, or equivalent practical experience, with specialization in Data Engineering
- 3-5 years of experience in Data Engineering (ETL, data lakehouse, data modeling)
- Strong fundamentals: algorithms, distributed computing, data structures, databases, data warehouses
- Fluency with: Python/Java/SQL
- Self-driven, challenge-loving, detail-oriented, and team-spirited, with excellent communication skills and the ability to multitask and manage expectations
Preferred Qualifications:
- MS/PhD in Computer Science/Engineering or relevant technical field, with specialization in Data Engineering
- 2+ years of experience in Hadoop/Apache technologies (Pig, Hive, HBase, Storm, Spark, Kafka, Oozie)
- 2+ years of experience in Google Cloud Platform technologies (BigQuery, Dataproc, Dataflow, Composer, Looker)