Upstart is a leading AI lending marketplace focused on reducing the cost and complexity of borrowing for Americans. As a Senior DevOps Engineer on the cloud platform team, you will design and operate Kubernetes clusters, evolve AWS infrastructure, and improve platform reliability and developer experience.
Responsibilities:
- Design and operate a fleet of Kubernetes (EKS) clusters across production, staging, and ephemeral environments, ensuring reliability and high availability
- Evolve AWS infrastructure and network architecture (VPCs, subnets, IAM, account structure) to support scalable, multi-team workloads
- Build and maintain infrastructure-as-code and GitOps workflows using tools such as Terraform, AWS CDK, and ArgoCD
- Improve platform reliability and performance by defining and driving SLOs, analyzing incidents, and implementing systemic fixes
- Participate in the on-call rotation and help improve it, leading incident response and post-incident reviews that feed back into the platform
- Partner with SRE, Delivery, InfoSec, and product/ML teams to land high-impact infrastructure changes and platform standards
- Drive improvements in developer experience by simplifying platform usage, reducing toil, and enabling faster product and ML development
- Contribute to cost efficiency initiatives by optimizing resource utilization across Kubernetes and cloud infrastructure
Requirements:
- Bachelor's degree in Computer Science, Engineering, Mathematics, or a related field (or equivalent practical experience), and 4+ years of experience
- Experience operating Kubernetes in production environments, including cluster networking, storage, and RBAC
- Proficiency with AWS infrastructure, including VPC design, networking, and IAM
- Proven expertise in implementing infrastructure-as-code using tools such as Terraform or AWS CDK
- Experience implementing GitOps workflows using tools such as ArgoCD or similar
- Ability to influence technical decisions across teams and drive adoption of platform standards
- Knowledge of service mesh and proxy technologies such as Istio or Envoy
- Experience designing or operating multi-cluster Kubernetes architectures
- Experience with cloud networking at scale, including ingress/egress or edge platforms (e.g., Cloudflare)
- Knowledge of cloud security, identity, and compliance frameworks (e.g., IAM, SOC 2, CIS benchmarks)