Job Title: Incident Manager
Location: Oakland, CA (ONSITE 3 DAYS A WEEK)
CONTRACT Role
JOB DESCRIPTION:
- Manage incident management bridge calls with support teams, on-call application support teams, and management. Manage, escalate, report status on, and help coordinate repair efforts for all major incidents (P1-P4).
- Provide regular communication updates to customers, end users, and other stakeholders throughout the incident management cycle.
- Track and document incident updates in real time.
- Handle major incidents, which are highly escalated cases, with presence of mind and innovation.
- Support the development and execution of change management plans to drive adoption and utilization of new processes, systems, and technologies.
- Review changes, including their priority and urgency, and perform risk analysis.
- Create problem tickets and their respective action items; review root cause analyses and their closure.
- Conduct post-incident reviews (PIRs) and prepare postmortem reports.
- Lead site reliability, disaster recovery, game day, switchover, and failover activities.
- Experience with multiple monitoring tools such as ServiceNow, PagerDuty, Slack, Zoom, JIRA, etc.
Qualifications/Skills Required:
- Degree in Computer Science, Information Technology, or a related field.
- 7-10 years of experience in incident management or a related field.
- Knowledge of cloud services is a must (AWS/Azure/Google Cloud Platform).
- Advanced proficiency in site reliability culture and principles, with the ability to demonstrate how to implement site reliability across platform teams while avoiding common pitfalls.
- Ability to plan and conduct site reliability testing.
- Experience in Application Management Services (AMS).
- Knowledge of incident management/change management/problem management processes and procedures.
- Experience with and knowledge of change management principles, methodologies, and tools.