Switch is a company that designs, builds, and operates data centers, aiming to create the world’s most advanced digital infrastructure. The Senior Principal Reliability Engineer will serve as the senior technical authority for mechanical and electrical reliability, defining fleet-wide reliability strategies and leading maintenance programs to prevent unplanned downtime.

Responsibilities:

Define and own fleet-wide reliability strategies for critical power and cooling infrastructure
Establish engineering standards, redundancy models, and reliability frameworks across all campuses
Architect and mature predictive and condition-based maintenance programs
Implement and optimize RCM, FMEA, PM optimization, and asset criticality methodologies
Develop remaining useful life models and failure forecasting approaches to reduce unplanned outages
Define telemetry and sensor standards across BMS, EPMS, SCADA, and DCIM platforms
Partner with controls and analytics teams to develop high-fidelity data models for reliability monitoring
Lead failure investigations and convert root cause findings into engineering, maintenance, or operational improvements
Drive systemic risk reduction initiatives across the global data center fleet
Own asset lifecycle reliability from commissioning through end-of-life modeling
Maintain global maintenance standards, templates, and procedural governance
Mentor engineers and operations leaders on reliability methodology and analytical techniques
Influence OEMs, design engineering, and construction teams to embed reliability into future deployments

Requirements:

Bachelor's degree in Mechanical or Electrical Engineering required
12 or more years of experience in mission-critical or hyperscale data center environments
Deep expertise in critical electrical systems such as UPS, medium- and low-voltage gear, and switchgear
Deep expertise in mechanical systems such as CRAH or CRAC units, chillers, cooling towers, and pumping systems
Proven experience in RCM, FMEA, RCA, predictive and condition-based maintenance, and reliability analytics
Experience with controls systems including BMS, EPMS, SCADA, and telemetry-enabled monitoring
Experience applying statistical reliability modeling such as Weibull analysis or Monte Carlo simulation
Preferred: experience with high-density cooling systems and advanced analytics or AI-enabled maintenance strategies
