Design, implement and own GitOps workflows using ArgoCD, including ApplicationSets/generator plugins and multi-environment, multi-cluster deployment patterns
Implement and manage policy-as-code using OPA/Gatekeeper and Kyverno to enforce security, compliance and best practices across clusters
Architect and operate an advanced service mesh with Istio, including traffic management, mutual TLS, zero-trust patterns, resilience and observability
Use Crossplane (XRDs/CRDs) and Kustomize to deliver fully declarative infrastructure and application configuration, enabling true platform-as-code
Build, maintain and optimize CI/CD pipelines with GitHub Actions, including reusable workflows, environment promotion, releases and rollbacks
Integrate Trivy, SonarQube and related tooling into pipelines to provide end-to-end software composition analysis (SCA), image scanning and static analysis, ensuring secure-by-default delivery
Design, implement and operate scalable, resilient solutions on AWS, collaborating on architecture and making pragmatic, high-impact design decisions
Manage and optimize content delivery and edge security using AWS CloudFront and other CDNs, including WAF integration and DDoS mitigation
Implement and evolve observability using Grafana Cloud (Prometheus, adaptive metrics, logging and alerting), ensuring actionable insights and SLO-driven operations
Contribute to, and at times lead, incident response, post-incident reviews and continuous improvement of reliability, performance and operability
Drive automation and self-service (e.g. GitOps + Crossplane + ArgoCD patterns) to reduce manual work and empower product teams to manage their own infrastructure
Collaborate closely with development, QA, security and operations teams to align on standards, simplify delivery and remove friction
Own documentation for the platform, patterns, runbooks and best practices, ensuring they are discoverable and up to date
Mentor and coach engineers in DevOps and platform practices, GitOps, Kubernetes and cloud; act as a role model for engineering excellence and ways of working
Engage effectively with technical and non-technical stakeholders, clearly explaining trade-offs, risks and options
Requirements
Strong understanding of cloud-native operations, Kubernetes platforms and infrastructure as code principles
4+ years’ experience running Kubernetes and Docker in production (high-load, high-availability environments)