Improve reliability and provide day-to-day AWS operational support — owning incident, alert, and request triage, on-call response, and operational health of the client's AWS estate
Collaborate closely with client platform and engineering teams to stabilize and improve production
Act as first responder for the client's AWS environment — triage, diagnose, and resolve incidents and service requests within agreed SLAs
Own alert handling across CloudWatch, GuardDuty, Security Hub, and AWS Health
Manage the operational ticket queue (incident, request, problem, and change) and drive problem management
Design, implement, and maintain scalable CI/CD pipelines for automated testing, deployment, and provisioning
Manage infrastructure as code with Terraform, CloudFormation, and CDK
Collaborate with software teams to integrate and deploy backend services and containerised applications
Ensure compliance with relevant standards (ISO 27001, SOC 2, POPIA / GDPR) per client requirements
Requirements
7+ years in Cloud DevOps, SRE, or AWS operational support, with strong CI/CD and infrastructure automation experience
Hands-on experience running AWS managed support / operations — incident management, alert triage, on-call, and SLA-bound resolution
Deep AWS proficiency: EC2, ECS/EKS, Lambda, S3, DynamoDB, RDS, VPC, Route 53, CloudFront, IAM (Identity Center), CloudWatch, X-Ray
Minimum Requirements
Matric (Grade 12) certificate
Bachelor's degree in Computer Science, IT, Engineering, or a related field (or equivalent practical experience)
At least one relevant professional certification (AWS DevOps Engineer, Solutions Architect, SysOps Administrator, CKA, or equivalent)
Tech Stack
AWS
Cloud
DNS
Docker
DynamoDB
EC2
Grafana
ITSM
Jenkins
Kubernetes
Prometheus
Python
X-Ray
Terraform
Vault
Benefits
Strong culture of learning
Internal speaking opportunities and sponsored attendance at technical events across the AWS ecosystem