Prudent Technologies and Consulting, Inc. is seeking a Senior Site Reliability Engineer to focus on its compute platform. The role involves designing and automating bare metal compute environments, implementing Bare Metal as a Service (BMaaS) platforms, and supporting production-grade Kubernetes clusters.

Requirements:

6+ years of experience in infrastructure engineering, platform engineering, or DevOps with a strong focus on compute system design
Proven experience designing and automating bare metal compute environments at scale
Strong hands-on experience with PXE boot, network-based OS provisioning, and automated server imaging
Experience implementing or supporting Bare Metal as a Service (BMaaS) platforms
Practical experience using Redfish APIs for hardware provisioning, power management, and remote lifecycle operations
Deep expertise with Ubuntu Linux in enterprise environments
Strong hands-on experience with KVM hypervisors (SUSE Harvester, OpenStack)
Experience designing and deploying production-grade Kubernetes clusters
Strong background with enterprise compute hardware platforms, including Cisco UCS, Dell PowerEdge, Supermicro, and HPE systems
Proficiency with Infrastructure as Code tools (e.g., Terraform, Ansible, or similar)
Experience building or supporting CI/CD pipelines for infrastructure and platform automation
Strong scripting skills in Python, Bash, or similar languages
Demonstrated ability to produce clear, structured technical design documentation
Excellent written and verbal communication skills
Bachelor's degree in computer science or equivalent professional experience

Sr. Site Reliability Engineer (Compute Platform), Remote
