Wavelo is a SaaS business focused on modernizing the operations of communication service providers. They are seeking a Senior Database Engineer to design and optimize the data persistence layer for large-scale systems, ensuring performance and reliability across the database environment.
Responsibilities:
- Design, implement, and operate highly available PostgreSQL clusters (physical/logical replication, sharding, partitioning, failover automation)
- Optimize query performance and indexing strategies
- Perform capacity planning, growth forecasting, and workload modeling
- Own high-availability strategies, including:
  - Automatic failover
  - Multi-region deployments
  - Disaster recovery
- Build and maintain automation for:
  - Provisioning and configuration
  - Backups and recovery
  - Failovers
  - Vacuum tuning
  - Schema management
- Use tools such as Terraform, Ansible/SaltStack, Bash, and Python
- Develop monitoring and alerting systems for PostgreSQL clusters
- Lead response during database incidents (e.g., performance regressions, replication lag, deadlocks, bloat, storage failures)
- Conduct root-cause analysis and implement long-term fixes
- Partner with software engineers to:
  - Review SQL queries
  - Optimize schemas
  - Ensure effective use of PostgreSQL features
- Provide guidance on:
  - Database design patterns
  - Migrations and version upgrades
  - Best practices
Requirements:
- 7+ years of hands-on PostgreSQL experience in large-scale, high-volume production environments
- Strong expertise in PostgreSQL internals: WAL, MVCC, vacuum tuning, query planner, indexing, logical replication
- Advanced SQL and strong schema design and query optimization skills
- Solid experience with Linux systems and networking fundamentals
- Experience building automation using Go or Python
- Experience with monitoring tools such as Prometheus, Grafana, Datadog, PMM, and pg_stat_statements
- Experience with connection pooling and load balancing: PgBouncer, HAProxy
- Experience with high-availability solutions: Patroni or similar tools
- Exposure to event streaming and CDC: Kafka, Debezium
- Experience supporting 24/7 production environments
- Experience with PostgreSQL backup tools: Barman, pgBackRest, WAL-G
- Familiarity with Traefik or similar infrastructure components