Rockwell Automation is a global technology leader focused on helping the world’s manufacturers be more productive, sustainable, and agile. The company is seeking a Senior Cloud Operations Engineer to maintain and improve its Kubernetes platform, ensuring high availability, security, and performance while collaborating with other teams to automate operations.
Responsibilities:
- Maintain and improve our Kubernetes platform, ensuring high availability and scalability
- Implement infrastructure and configuration as code to automate operations (Terraform, Ansible, Helm, Flux, Kustomize)
- Enhance observability and logging using OpenTelemetry and Elastic
- Build automated solutions that enable resiliency and self-healing of applications
- Manage server operating systems (Windows and Linux)
- Manage web servers (IIS 10)
- Troubleshoot production incidents, perform root cause analysis, and drive reliability improvements
- Evaluate and implement cloud-native technologies to enhance platform efficiency
- Collaborate with security teams to ensure best practices for container security and compliance
- Work with multi-cluster management solutions such as Rancher, Cluster API (CAPI), or other Kubernetes fleet management tools
- Manage Kubernetes infrastructure on Azure and vSphere
- Participate in an on-call rotation to support platform operations and respond to incidents
Requirements:
- Bachelor's Degree or equivalent years of relevant work experience
- Legal authorization to work in the U.S. We will not sponsor individuals for employment visas, now or in the future, for this job opening
- Typically requires 5+ years of relevant professional experience in a cloud infrastructure, platform engineering, or operations role
- 3+ years working with Kubernetes in a production environment
- Proficiency with Terraform and Ansible
- Load balancer experience (F5 LTM, Azure Load Balancer)
- Public Cloud experience (Microsoft Azure or Amazon Web Services)
- Experience with Linux administration and container runtimes (Docker, containerd)
- Familiarity with observability tools (OpenTelemetry, Elastic, PRTG, and Dynatrace)
- Experience managing multi-cluster Kubernetes environments (Rancher and Cluster API)
- Solid understanding of RBAC, security policies, and secrets management in Kubernetes
- Hands-on experience with Azure and vSphere as Kubernetes infrastructure providers
- Ability to analyze packet captures using tools such as Wireshark
- Strong understanding of IPv4/IPv6, FTP, HTTP, SSL/TLS, HTML, XML
- Knowledge of .NET website functionality
- Ability to participate in an on-call rotation for platform support
- Prior experience in SRE or Platform Engineering roles
- Degree in Computer Science or related area