Yahoo is an American web portal that provides various services, including Yahoo Search and Yahoo Mail. It is seeking a motivated entry-level AI Engineer to join its AI & ML team in Champaign, Illinois. The role involves designing and building scalable tools for data governance and orchestration, as well as integrating generative AI into production systems.
Responsibilities:
- Assist in building AI features using Python, LLM APIs (OpenAI, Anthropic, etc.), and vector-embedding pipelines
- Support prompt engineering and RAG workflows: design, test, iterate prompt templates, integrate vector search
- Help build and maintain AI-model monitoring/observability dashboards (tracking model accuracy, latency, and drift) and work with backend engineers to integrate AI services into the product
- Participate in experimenting with AI workflows: multi-agent orchestration, model fine-tuning, system prompts
- Work through documents and conversations with colleagues to understand product requirements for new features
- Work closely with cross-functional teams to understand product and technical roadmaps, identifying potential impacts on system operability and proposing proactive solutions for Cloud environments
- Lead initiatives to enhance and optimize existing cloud infrastructure, drive improvements in scalability, efficiency, and resilience, and oversee large-scale projects related to cloud platforms, automation, and performance optimization
- Foster cross-functional collaboration between development, infrastructure, and operations teams to improve the overall performance, reliability, and security of services on cloud
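As a rough illustration of the prompt-engineering and RAG work described above, the sketch below retrieves the most relevant documents for a query and assembles them into a prompt template. It is a minimal, self-contained example: the bag-of-words `embed` function is a stand-in for a real embedding model, and the document snippets are invented for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' with unit norm. A production pipeline
    would call an embedding model via an LLM provider's API instead."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {w: v / norm for w, v in counts.items()}

def cosine(a, b):
    """Cosine similarity between two sparse unit vectors."""
    return sum(v * b.get(w, 0.0) for w, v in a.items())

def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, docs):
    """Fill a simple RAG prompt template with the retrieved context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Yahoo Mail supports custom folders and filters.",
    "Vector search retrieves semantically similar documents.",
    "Airflow schedules and monitors data pipelines.",
]
prompt = build_prompt("How does vector search work?", docs)
print(prompt)
```

In a real system the prompt would then be sent to an LLM API, and the template itself would be iterated on and tested against evaluation sets, which is the design/test/iterate loop the responsibilities describe.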
Requirements:
- A solid computer science foundation in data structures and algorithms, object-oriented programming, and modern software engineering practices, typically gained through a degree in CS or a related engineering field
- Proactive in staying updated with evolving AI trends and new LLM releases
- Skilled at diagnosing and solving complex, ambiguous problems with curiosity and a product-focused mindset
- Experience working with the latest Large Language Models (LLMs) and AI advancements, cloud-native AI services such as SageMaker or Vertex AI, and LLM-orchestration libraries such as LangChain or LlamaIndex
- Proficiency in an object-oriented programming language such as Java or C++, or a scripting language such as Python or Perl, along with Unix/Linux systems
- Knowledge of SQL and distributed query engines (e.g., Presto, Trino, Athena, BigQuery). Familiarity with data concepts such as joins, aggregation, projection, and explosion
- The ability to work with large-scale distributed systems
- Strong analytical and problem-solving skills with the ability to work effectively in a cross-functional, collaborative environment
- Working knowledge of AWS and GCP cloud environments, including core data and compute services (e.g., EMR, MWAA, S3, Lambda, ECS, BigQuery, Dataproc)
- Experience with data pipeline orchestration tools and frameworks such as Oozie and Airflow
- Experience designing and optimizing queries to run efficiently on platforms such as BigQuery, Hive, Pig, and Spark, ensuring high performance and scalability
- Familiarity with modern data architectures, including lakehouse and Medallion design patterns
- Understanding of data-processing and data-governance concepts
- Familiarity with AI-assisted engineering tools (e.g., Cursor, MCP, Copilot, agentic AI frameworks) and emerging AI/ML technologies that enhance data engineering productivity
- Experience with Infrastructure as Code (IaC) tools, such as Terraform, CloudFormation, or Ansible, to automate and manage cloud infrastructure deployments
- Working experience with Kubernetes and container-based orchestration
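The SQL requirement above (joins, aggregation, projection) can be sketched with a small, self-contained example. SQLite stands in here for distributed engines such as Presto, Trino, or BigQuery; the table names and data are invented for illustration.

```python
import sqlite3

# In-memory database with two toy tables: users and their request events.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE events (user_id INTEGER, latency_ms REAL);
    INSERT INTO users VALUES (1, 'us'), (2, 'us'), (3, 'eu');
    INSERT INTO events VALUES (1, 120), (1, 80), (2, 200), (3, 50);
""")

# Projection, join, and aggregation in one query: join events to users,
# then compute per-region event counts and average latency.
rows = conn.execute("""
    SELECT u.region, COUNT(*) AS n, AVG(e.latency_ms) AS avg_latency
    FROM events e
    JOIN users u ON u.id = e.user_id
    GROUP BY u.region
    ORDER BY u.region
""").fetchall()
print(rows)
```

The same join/group-by shape carries over almost verbatim to the distributed engines listed above; what changes at scale is mainly partitioning, join strategy, and data layout, which is where the query-optimization experience comes in.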