Lead the design, build, and rollout of an internal developer platform on top of AWS, EKS, Argo CD, and GitHub Actions that lets engineers create, deploy, and operate services with minimal friction.
Own and evolve our service templates, Helm chart conventions, and Argo CD App-of-Apps patterns so that adding or migrating a service is a guided, low-risk experience.
Build and maintain reusable GitHub Actions workflows (build / push / scan, frontend build / deploy, SonarQube scans, semantic release) and improve CI feedback loops, build times, and caching.
Define and enforce platform standards for observability — structured logs into Loki, metrics into Prometheus / Mimir, dashboards in Grafana, and SLOs / alerts wired in by default.
Build self-service tooling around environments, secrets, feature flags, and access, so that the right thing is easy and the wrong thing is hard to do by accident.
Own the developer-facing aspects of identity and access (Auth0, IdP integrations, Tailscale access, IRSA / service accounts) and keep onboarding and offboarding smooth.
Partner with DevOps on infrastructure changes, with Cloud Security on guardrails, and with backend / frontend / data / firmware teams to understand their pain points and prioritize platform investments.
Mentor engineers across the org on platform conventions, lead design reviews for new services, and push back on patterns that don’t scale.
Treat the platform as a product: gather feedback, define roadmaps, write docs, and measure adoption and reliability.
Requirements
5+ years in Platform Engineering, DevOps, or SRE roles, including significant experience building and shipping developer-facing tooling for other engineering teams.
Track record of owning and delivering platform initiatives end-to-end, from design through adoption, with limited day-to-day supervision.
Strong working knowledge of Kubernetes (EKS or similar) and GitOps workflows with Argo CD or Flux.
Hands-on experience with Infrastructure as Code using Terraform; comfort with Terragrunt or a similar wrapper.
Solid experience with CI/CD systems, ideally GitHub Actions, including reusable / composable workflows and release automation.
Working knowledge of AWS core services (EKS, EC2, RDS, S3, IAM, VPC, ECR) and how to compose them into reliable, secure platforms.
Experience designing developer abstractions (Helm charts, service templates, internal CLIs, scaffolding tools, or Backstage-style portals) that other engineering teams can easily use.
Strong programming skills in Python, Bash, or TypeScript for building tooling and automation.
Experience integrating observability (Grafana, Loki, Prometheus / Mimir, OpenTelemetry, or similar) as a default rather than an afterthought.
Strong written communication skills, with a habit of writing docs, runbooks, and wikis that engineers can actually use.
Tech Stack
AWS
Cloud
EC2
Flux
Grafana
Kubernetes
Prometheus
Python
Terraform
TypeScript
Benefits
Health, Dental & Vision (Gold and Platinum plans, with some providers' plans fully covered)
Paid parental leave
Alternating day off (every other Monday)
“Off the Grid”: a paid two-week break each year for all employees