Yahoo is a global company helping millions achieve their online goals through iconic products. They are seeking a Senior Data Engineer to design, build, and optimize scalable data pipelines and infrastructure that support AI initiatives, collaborating closely with software engineers and business stakeholders.
Responsibilities:
- Design, build, and maintain scalable data pipelines and ETL processes to support machine learning and AI initiatives on Google Cloud Platform (GCP)
- Implement and optimize data storage solutions using GCP services such as BigQuery, Cloud Storage, and Dataflow
- Ensure data quality, integrity, and security throughout the data lifecycle
- Collaborate with analysts and business stakeholders to understand data requirements and deliver actionable insights
- Monitor, troubleshoot, and maintain the health and performance of cloud-based data infrastructure
- Automate manual processes and repetitive tasks to improve efficiency and reduce errors
- Apply data governance and compliance best practices to protect sensitive information and meet regulatory standards
- Stay current with new GCP features, tools, and best practices to continuously enhance data management capabilities
- Document solutions, processes, and architectural decisions to facilitate knowledge sharing and maintainability
Requirements:
- BS or MS in Computer Science or a related major, or equivalent experience
- 7+ years of software engineering experience, with a strong emphasis on system design and backend development
- 2+ years of hands-on experience with the Google Cloud Platform ecosystem (BigQuery, Dataproc, Composer, Dataflow, Data Catalog, Observability) or AWS equivalents
- Proven ability to design, build, and maintain data pipelines that support machine learning and AI model development, training, and deployment
- Fluency in at least one object-oriented programming language (Java, Python, or Scala) is highly desirable, as these skills are critical for developing robust applications and managing data workflows. SQL proficiency is also valued for database operations
- Familiarity with data security, compliance, and governance best practices
- Strong problem-solving skills, attention to detail, and ability to work collaboratively with cross-functional teams
- Excellent communication skills, including the ability to tell insightful stories using data and to manage communication with internal teams and stakeholders
- Exposure to AI-assisted development tools such as Claude, GitHub Copilot, Cursor, or similar is highly desirable
- Experience with Google Analytics 360 is a plus