Pegasystems is a leading technology company focused on improving cloud service offerings. The Senior Cloud Operations Engineer will be responsible for ensuring the reliability and availability of the Pega Cloud Platform, managing environments, troubleshooting issues, and collaborating with teams to enhance operational excellence.
Responsibilities:
- Provision new environments and upgrade infrastructure components and the product application
- Perform decommission of existing environments
- Troubleshoot and resolve customer environment issues, including root cause analysis
- Create and maintain operational runbooks
- Identify and document standard operating procedures for daily tasks
- Participate in pre-release testing of product enhancements with Engineering
- Identify opportunities to automate repeated tasks and reduce toil
- Write scripts to automate repetitive tasks
- Work with the team to schedule upgrade tasks, hotfixes, and patches
- Manage and execute deployment of system updates, patches, and hotfixes
- Monitor the team's ticket queue and work with the team to distribute tickets in a timely manner
- Monitor the team's email distribution list for escalations and communications, and work with the team to respond in a timely manner
- Prepare handoff documentation to work with other global teams
- Willingness to be on call to support customers 24x7 on a rotational basis
- Flexibility to work a Tuesday-Saturday shift
Requirements:
- US Citizenship or US Permanent Residency is required
- Proven professional and technical experience in an enterprise cloud environment supporting SaaS applications, with a focus on operational delivery excellence and customer service
- Self-motivated, inquisitive, and creative, with a passion for continuous improvement and excellent people skills
- Works well with cross-functional global and remote teams
- Demonstrated ability to learn new technologies, techniques, and tools quickly to meet business requirements
- Comfortable working in a fast-paced, enterprise environment
- Customer obsession and proven empathy towards customers
- 5+ years of hands-on operational or engineering experience in installing, configuring, troubleshooting, and tuning Java applications and Apache Tomcat application servers
- 5+ years of experience with enterprise scale Linux Administration
- Hands-on operational experience with Amazon Web Services (AWS) and/or Google Cloud Platform (GCP)
- Deep understanding of cloud-based infrastructure, platform, and application operational administration, including product and platform upgrades, installations, backup and recovery, and monitoring and observability
- Administration of web servers running Tomcat, Apache, IIS, Nginx
- Bachelor's degree in Computer Science/Engineering or equivalent
- Experience with microservices architecture with Kubernetes is a plus
- Basic network troubleshooting skills, including TCP/IP, DNS, and VPN, are a plus
- Experience in Bash/Shell, Python, or similar scripting languages to automate common tasks is a plus
- AWS / GCP Certification is a plus
- Certified Kubernetes Administrator certification is a plus