Render is building a modern cloud platform for developers creating AI-native applications. The role involves designing and building cloud infrastructure to support rapid scaling and performance improvements, ensuring a secure and reliable platform for users.
Responsibilities:
- Own Render's core infrastructure across multiple data centers and regions
- Help deliver unique capabilities to Render customers through infrastructure innovation
- Plan and architect for rapidly increasing scale
- Debug issues at all levels in our infrastructure stack
- Improve the performance and reliability of our infrastructure through increased observability, load testing, and chaos engineering
- Collaborate with other engineers to help keep our platform stable, predictable, and secure
- Participate in our on-call rotation with the rest of the engineering team
Requirements:
- At least 5 years of experience building and scaling cloud infrastructure
- Experience developing, maintaining, and debugging production systems at scale
- Experience building, operating, and scaling Kubernetes clusters or similar resource/container orchestration systems
- Experience with infrastructure-as-code tools such as Terraform, Pulumi, or Ansible
- Experience with Linux kernel and/or container optimization
- Familiarity with observability tools such as Datadog, Grafana, or OpenTelemetry
- Experience hosting PostgreSQL (or similar data stores) at scale
- Security hardening skills, especially in the context of untrusted workloads