Authority Partners is a leading global IT services company with over 27 years of experience. They are seeking a Senior DevOps / Platform Engineer to manage infrastructure provisioning and CI/CD for an AI Native platform, focusing on AWS and Kubernetes.
Responsibilities:
- Own all infrastructure provisioning, CI/CD, observability, and disaster recovery for a complex AI Native platform engagement
- Cover the full AWS/Kubernetes stack from Day 1
- Review and approve AI-generated IaC rather than writing everything manually
- Manage secrets, security group configuration, and IAM role assignments
Requirements:
- 5+ years of DevOps/platform engineering with AWS production experience: EKS cluster management, RDS PostgreSQL multi-AZ, ElastiCache Redis, OpenSearch Service, SQS/SNS, S3 + CloudFront
- Terraform and Helm at production depth: multi-environment module structure, state backend configuration, ability to review AI-generated Terraform plans for unnecessary resource replacements, missing lifecycle rules, and IAM permission sprawl
- GitHub Actions CI/CD pipeline authorship: multi-stage pipelines with SAST gate integration, Docker image build and ECR push, Helm deploy with environment protection rules for production promotion
- SAST tooling (Semgrep, SonarQube, Snyk, or equivalent): writing custom rules for TypeScript monorepos, triaging findings at PR volume
- Secrets management: External Secrets Operator + AWS Secrets Manager, IRSA for Kubernetes service accounts, zero plaintext secrets in Git
- Grafana dashboard authorship for production SLOs: RED metrics, database pool utilisation, Redis hit/miss ratio, latency, and uptime alerting
- Live RDS multi-AZ failover drills with measured actual RTO
- DR runbook authorship for complex failure scenarios, particularly distributed identity/IAM cluster failure
- OpenTelemetry instrumentation for Node.js: auto-instrumentation, manual span creation, OTLP exporter configuration, trace context propagation
- Container security: Trivy or Snyk container scanning in CI/CD, Dockerfile multi-stage build optimisation, non-root container user enforcement
- GDPR infrastructure compliance: EU region data residency enforcement, S3 cross-region replication restrictions, CloudTrail audit log retention
- Experience reviewing AI-generated Terraform/Helm for security issues: overly broad IAM permissions, missing PodDisruptionBudgets, incorrect resource limits, missing encryption at rest