Backblaze is the object storage leader in the open cloud movement, and it is seeking a Site Reliability Engineer I to help support the stability, health, and day-to-day operations of its infrastructure. The role involves responding to customer-affecting issues, managing incident resolution, and collaborating with cross-functional teams to ensure operational readiness.
Responsibilities:
- Act as the first point of contact for all customer-affecting issues
- Be a key driver in managing the resolution of technical problems
- Ensure that incident management processes are followed and that incident post-mortems are completed to capture process deviations and areas for improvement
- Deliver consistent communication to management
- Monitor Zabbix regularly and respond to alerts, either by taking direct action or by escalating. Acknowledge every alert, noting either the direct action taken or the escalation point of contact
- Make sure escalations are handed off successfully
- Ensure the health of pods across all sites (define pod alerts in Zabbix)
- Work through daily filesystem checks for pods
- Troubleshoot technical issues for DC Techs: advanced pod questions, deployment questions, migration troubleshooting, and Ansible playbook issues
- Identify and escalate any potential network issues
- Perform Vault pre-deployment configuration and testing
- Start Vault migrations, monitor migration pods, and handle applicable migration pod health checks
- Document and work on automating daily items
- Document and provide network IPs for upcoming deployments
- Monitor releases and updates to the server farm, and escalate issues as they arise
- Participate in on-call rotation shifts
- Assist fellow TechOps team members in handling tasks
- Recommend improvements to organizational productivity
- Be able to work outside normal business hours (weekend shifts, holidays, and evenings) as needed
Requirements:
- Must be located in Bangalore
- 2 to 4 years of relevant experience
- System administration knowledge and Linux skills
- Desire to learn and develop all necessary technical skills
- Strong analytical thinking
- Strong communication skills and the ability to work with different teams
- Knowledge of network cabling, network classification, and network topology