Yahoo serves as a trusted guide for hundreds of millions of people globally, helping them achieve their goals online through a portfolio of iconic products. As a Senior Software Engineer, you will design and optimize the foundational storage layer powering a massive dataset, ensuring efficient data access and high availability for APIs serving millions of requests per second.
Responsibilities:
- Design and optimize Cloud Spanner schemas for efficient profile storage, query patterns, and write throughput at 2.5B+ profile scale
- Implement Valkey (Redis-compatible) caching strategies achieving sub-10ms read latency for hot data access patterns
- Build multi-region Spanner replication and automated failover mechanisms to ensure 99.99% availability and support disaster recovery
- Optimize Spanner read/write throughput, reduce hot-spotting, and improve query performance through index design and query optimization
- Implement comprehensive monitoring and alerting systems tracking storage health, latency percentiles (p50, p95, p99), capacity utilization, and cost
- Collaborate with the API team on efficient data access patterns, query optimization, and caching strategies for activation endpoints
- Partner with the Ingestion team on high-throughput write patterns, batch loading strategies, and schema evolution without downtime
- Design backup, point-in-time recovery, and disaster recovery procedures for critical user profile data
- Troubleshoot production storage issues including performance degradation, hot-spotting, lock contention, and capacity constraints
- Work with SRE teams on capacity planning, autoscaling strategies, cost optimization, and infrastructure efficiency
- Implement cache invalidation strategies, cache warming, and distributed caching patterns for consistent data access
- Create comprehensive documentation for storage architecture, operational runbooks, disaster recovery procedures, and on-call playbooks
Requirements:
- Bachelor's degree in Computer Science, Engineering, or related technical field
- 5+ years of software engineering experience building production systems
- 3+ years of hands-on experience with distributed databases or large-scale storage systems
- 2+ years of experience with GCP infrastructure (Spanner, Memorystore, Cloud Monitoring) or AWS equivalents (DynamoDB, ElastiCache)
- Strong proficiency in Java, Go, or Python for infrastructure and database tooling development
- Hands-on experience with Cloud Spanner, CockroachDB, TiDB, or other distributed SQL databases
- Experience with Redis, Valkey, Memcached, or other distributed caching systems in production
- Deep understanding of distributed systems: consistency models (strong vs. eventual), replication strategies, consensus algorithms (Paxos, Raft)
- SQL optimization skills and database schema design expertise including indexing strategies, partitioning, and query tuning
- Familiarity with database performance tuning: profiling slow queries, analyzing execution plans, mitigating hot-spotting
- Strong performance tuning and troubleshooting abilities in distributed database environments
- Demonstrated ability to deliver reliable infrastructure solutions on schedule with minimal guidance
- Excellent collaboration skills with infrastructure, application, and SRE teams
- Team-level impact, with the ability to influence technical decisions within the immediate team
- Understanding of data durability, consistency guarantees, and operational excellence
- Experience with multi-region Cloud Spanner deployments at petabyte scale
- Knowledge of cache invalidation strategies, cache coherence protocols, and distributed caching patterns
- Prior experience in large-scale user data platforms, identity systems, or adtech storage infrastructure
- Familiarity with database migration tools (gh-ost, pt-online-schema-change) and zero-downtime schema evolution
- Understanding of data partitioning strategies, sharding, horizontal scaling, and distributed transaction processing
- Experience with database backup and recovery tools, point-in-time recovery, and disaster recovery testing
- Contributions to database or distributed systems open-source projects (Spanner clients, Redis modules, CockroachDB)
- Self-driven and detail-oriented, with excellent multitasking abilities in fast-paced environments