Design, deploy, and support AWS-based cloud platforms using best practices for high availability, fault tolerance, and security
Build and maintain Infrastructure as Code using tools such as Terraform or CloudFormation
Continuously optimize platform performance, scalability, and cost efficiency
Improve resilience and reliability across distributed systems
Design and maintain modern CI/CD pipelines using tools such as GitHub, Harness, CodePipeline, CodeBuild, and CodeDeploy
Drive automation of infrastructure provisioning and application deployments
Champion the adoption of AI-powered tooling for pipeline optimization, intelligent testing, incident detection, and operational insights
Implement and improve monitoring, alerting, and logging solutions
Perform advanced troubleshooting across cloud and production environments
Participate in an on-call rotation to support critical systems
Lead technical initiatives from design through implementation and operational support
Requirements
2+ years of hands-on experience building and maintaining CI/CD pipelines and working with artifact management tools (GitHub, GitLab, Harness, Artifactory, CodePipeline, etc.)
5+ years of experience in DevOps, Site Reliability, or Systems Engineering roles
Strong expertise in AWS services and Linux-based environments
Proficiency in Python or similar scripting languages
Excellent communication skills
Experience with Infrastructure as Code (Terraform, CloudFormation, etc.)
Strong understanding of distributed systems, system architecture, and networking fundamentals
Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience
Experience with Kubernetes and container orchestration