H1 is a company dedicated to expanding global access to healthcare information and promoting health equity. It is seeking a Senior DevOps / AWS Cloud Engineer to design, scale, and operate its cloud infrastructure, ensuring high availability, security, and efficiency in a fast-paced environment.
Responsibilities:
- Architect, automate, and scale AWS infrastructure using Infrastructure-as-Code (IaC), enabling secure and reliable self-service for engineering teams
- Own and evolve CI/CD pipelines and deployment workflows across a growing portfolio of services and applications
- Lead efforts to improve system reliability, availability, observability, and performance across production environments
- Serve as a senior escalation point for infrastructure incidents, driving incident response, root cause analysis, and long-term remediation
- Design and implement monitoring, alerting, and logging strategies to proactively identify and resolve system issues
- Partner with engineering and data teams to support containerized workloads and Kubernetes-based services (Amazon EKS)
- Optimize cloud infrastructure for cost, performance, and scalability using metrics, forecasting, and usage analysis
- Champion cloud security best practices, IAM design, and infrastructure governance in partnership with security stakeholders
- Evaluate, recommend, and implement new tools and services that improve platform reliability and developer experience
- Mentor and support other DevOps engineers through technical guidance, reviews, and knowledge sharing
- Participate in a light on-call rotation and help mature operational processes and runbooks
Requirements:
- 8+ years of hands-on experience building and operating cloud infrastructure, with significant experience in AWS
- Deep expertise in AWS services, including compute, networking, storage, and security (EC2, ECS/EKS, VPC, IAM, RDS, etc.)
- Strong proficiency with Infrastructure-as-Code (IaC) tools such as Terraform and CloudFormation
- Extensive experience designing and maintaining CI/CD pipelines using tools such as GitHub Actions, GitLab CI, or CircleCI
- Solid experience supporting Kubernetes-based workloads, including Amazon EKS
- Strong experience with monitoring and observability stacks (Prometheus, Grafana, CloudWatch, ELK/OpenSearch)
- Proficiency in scripting and automation using Python and Bash
- Strong understanding of cloud security best practices, IAM, and incident response processes
- Experience leading or significantly contributing to incident response, postmortems, and reliability improvements
- A proven track record of designing, implementing, and supporting highly available, 24x7 SaaS platforms in AWS
- Deep experience with cloud-native and containerized architectures in production environments
- Strong working knowledge of networking fundamentals (HTTP/S, SSH, VPNs, firewalls, VPC design)
- Experience contributing to security governance, infrastructure standards, and compliance initiatives
- Excellent communication and collaboration skills, with the ability to influence across engineering and non-engineering teams
- A team-first mindset and commitment to building inclusive, scalable engineering practices