Pantheon powers over 300,000 websites for organizations worldwide. The company is seeking a Principal Engineer to lead the Hosting Platform team, which is responsible for the infrastructure behind those sites and the billions of monthly requests they serve, with a focus on reliability, scalability, and performance.

Responsibilities:

Define and drive the technical roadmap for platform reliability, scalability, and performance across hundreds of thousands of sites and billions of monthly requests
Architect, build, and evolve the core services underpinning Pantheon’s hosting platform — including edge delivery, container orchestration, database services, and site lifecycle management
Partner with product, security, and infrastructure teams to identify the root causes of platform issues and design iterative, high-impact solutions
Balance immediate business needs against long-term architectural health, making principled trade-offs that keep the platform sustainable as it scales
Provide technical leadership across the engineering organization — setting direction, reviewing designs, and raising the bar for quality, reliability, and operability
Mentor and coach engineers at all levels, providing technical guidance and growing engineering talent across the broader organization
Contribute to an engineering-wide culture of collaboration, blameless postmortems, and continuous improvement

Requirements:

10+ years of software development experience, with significant tenure building platform or infrastructure products
5+ years of experience designing and architecting large-scale distributed systems
Proficiency with container orchestration (Kubernetes), web server technologies (NGINX, PHP, Node.js, or similar), and infrastructure-as-code practices
Demonstrated ability to own and improve a production platform serving millions of concurrent users or requests
Proficiency in Go, Python, or equivalent systems-oriented languages
Deep expertise in distributed systems design, large-scale service architecture, and cloud-native infrastructure (Pantheon runs on Google Cloud)
Experience building or operating CDN, edge delivery, or networking-layer systems at scale — including caching strategies, cache invalidation, and edge performance optimization
Strong understanding of multi-tenant hosting platforms — including the SLO definition, observability, and incident response required to operate them at scale for hundreds of thousands of customer sites
Experience operating database systems as managed services — relational and non-relational — with appreciation for the operational complexity involved
Familiarity with AI-native engineering practices, including fluency with AI coding assistants (GitHub Copilot, Cursor, Claude Code) and a track record of integrating AI tools into engineering workflows, automation, and architectural decision-making
Experience building agentic and LLM-powered systems — including task orchestration, prompt engineering, and RAG — with the ability to prototype and ship AI features to production
Awareness of the infrastructure requirements of AI workloads — including model serving, GPU/accelerator provisioning, inference latency optimization, and cost trade-offs at scale

Principal Software Engineer - Hosting
