Respond to escalated incidents and engage vendors as necessary. Identify, track, and resolve associated problems
Author and maintain documentation and knowledge base articles
Adhere to all security guidelines and ensure that systems are properly protected with working backups; ensure all others with access are doing the same
Research and evaluate product offerings and emerging trends
Identify and document technical applications, architectures, and requirements
Order, provision and configure supporting systems and infrastructure
Interact with vendors and provide feedback on their products
Keep training materials up to date
Requirements
Bachelor’s degree in a related field, or equivalent professional experience
Four years of relevant experience, including at least one year in a structured, large-scale operational environment
Experience with Change, Incident, and Problem management processes
Working experience with Unix or Windows servers
Working experience with network management principles
Working experience with Cloud technology principles
Working experience with DevSecOps principles
Knowledge of scripting
Exceptional oral and written communication skills
Participate in system maintenance, including patching and upgrades, to meet compliance expectations
This job operates in a professional office environment
To successfully perform the essential functions of the job, there may be physical requirements that need to be met, such as sitting for long periods of time and using computer monitors and equipment
Experience defining and operating SLIs/SLOs and error budgets using Dynatrace, with actionable alert integration into ServiceNow
Proven production incident response and observability experience (metrics, logs, traces) within ITIL-aligned ServiceNow environments
Strong automation and resiliency background, including Azure DevOps pipelines, Infrastructure as Code, toil reduction, and RTO/RPO-driven design
Tech Stack
Azure
Cloud
ServiceNow
Unix
Benefits
Document the steps required to fulfill service requests
Plan and perform approved scheduled and emergency maintenance. This includes documenting the maintenance and following the change management process, including approving documented change requests
Ensure that monitoring is sufficient for all systems
Participate in the design and implementation of supporting systems and infrastructure