Equinix is the world’s digital infrastructure company®, enabling innovations that enrich our work, life, and planet. The Principal Platform Engineer will lead the technical vision for observability and reliability standards across Equinix's global hybrid infrastructure, ensuring secure and scalable delivery of digital products.

Responsibilities:

Interacts with internal product management and engineering teams to understand product requirements and define the platform roadmap
Works with the Equinix Engineering Excellence (E3) team in the Equinix IT organization to find common points of acceleration and bidirectional consumption of services
Acts as a lead representative for Infrastructure P&S requirements in forums for enterprise-wide developer initiatives, plans, and architectures
Defines the platform reliability standards through the development of a comprehensive SLO/SLI framework
Drives architectural consistency for observability across a hybrid footprint including 31 metros and multiple AWS regions
Consolidates all application observability signals onto a single platform (Grafana Cloud) to provide a single source of truth
Provides technical leadership for the design of the "Paved Path" regarding application assurance and reliability signals
Evaluates disparate observability tools and parallel support systems, and recommends consolidating them into unified, strategic solutions
Designs integration strategies for identity and access management to ensure secure developer access to platform tools
Participates in the development of automated reliability signals and self-service observability tools
Drives project work and creates automation for the observability stack and application lifecycle tools
Participates in peer reviews and technical integration efforts to ensure cross-functional alignment within the PTD and CPS organizations
Sets standards for application assurance, including vulnerability management and identity integration programs
Recommends frameworks for measuring platform performance, such as Kubernetes API server uptime and provisioning delivery time
Articulates the vision for a unified runtime that leverages both global on-premises footprints and cloud capabilities
Leads the Observability Stack Unification charter as part of the broader CI/CD and platform consolidation effort
Utilizes FinOps and financial observability reporting to provide cost attribution by product, team, and organization
Defines and publishes critical reliability metrics, including Mean Time to Detect (MTTD) and Mean Time to Repair (MTTR)
Provides L4 technical escalation capacity to stabilize critical, high-toil services
Participates in on-call rotations for observability and operations areas to ensure 24/7 platform stability
Serves as a technical liaison for internal product teams (the platform's customers) to understand concerns and priorities
Acts as a primary point of contact for technical perspectives and alignment with stakeholders in the Equinix product organization and the Equinix IT organization
Works with Engineering Managers to define platform KPIs and project schedules for unification efforts
Provides status reporting on the Observability Standard and other strategic consolidation projects
Investigates and evaluates new observability technologies to reduce infrastructure toil for product teams
Influences the organization’s technical objectives by identifying fruitful opportunities in areas like telemetry and proactive alerting

Requirements:

10+ years in Platform Engineering, Site Reliability Engineering (SRE), or Observability-focused roles
Bachelor's degree in Computer Science, Computer Engineering, or a related technical field
Expert-level knowledge of Platform Engineering, Grafana Cloud, Observability concepts (Logs, Metrics, Traces, RUM, Synthetics, etc.), and Operational Readiness
Competence with Kubernetes, ArgoCD, on-premises and cloud (AWS) infrastructure, and software engineering practices, including CI/CD
Familiarity with Go development, cluster-api, and the CNCF ecosystem
