Prudent Technologies and Consulting, Inc. is seeking a Senior Site Reliability Engineer focused on the Compute Platform. The role involves designing and automating bare metal compute environments, implementing Bare Metal as a Service (BMaaS) platforms, and supporting production-grade Kubernetes clusters.
Requirements:
- 6+ years of experience in infrastructure engineering, platform engineering, or DevOps, with a strong focus on compute system design
- Proven experience designing and automating bare metal compute environments at scale
- Strong hands-on experience with PXE boot, network-based OS provisioning, and automated server imaging
- Experience implementing or supporting Bare Metal as a Service (BMaaS) platforms
- Practical experience using Redfish APIs for hardware provisioning, power management, and remote lifecycle operations
- Deep expertise with Ubuntu Linux in enterprise environments
- Strong hands-on experience with KVM-based virtualization platforms (SUSE Harvester, OpenStack)
- Experience designing and deploying production-grade Kubernetes clusters
- Strong background with enterprise compute hardware platforms, including Cisco UCS, Dell PowerEdge, Supermicro, and HPE systems
- Proficiency with Infrastructure as Code tools (e.g., Terraform, Ansible, or similar)
- Experience building or supporting CI/CD pipelines for infrastructure and platform automation
- Strong scripting skills in Python, Bash, or similar languages
- Demonstrated ability to produce clear, structured technical design documentation
- Excellent written and verbal communication skills
- Bachelor's degree in computer science or equivalent professional experience