Leads the design, implementation, and optimization of cloud-native solutions using AWS services such as EC2, Lambda, S3, RDS, and EKS, with a keen focus on cost management.
Troubleshoots complex infrastructure problems, often spanning multiple layers of the stack and requiring collaboration across teams.
Leads root cause analyses for incidents and participates in the team’s 24/7 on-call rotation.
Partners with Security and Compliance teams to ensure cloud environments meet regulatory requirements (e.g., PCI-DSS, SOC 2, FFIEC), and leads internal and external audit reviews, collecting evidence and responding to audit requests in a timely manner.
Architects and implements disaster recovery plans ensuring that our systems are highly available, fault-tolerant, and able to recover from failure.
Implements and enforces cloud security best practices including IAM policies, VPC design, encryption and key management (KMS).
Drives the adoption and implementation of Infrastructure as Code (IaC) using Terraform and CI/CD best practices.
Oversees the cloud vulnerability management and configuration program, ensuring patching meets our Service Level Agreements (SLAs).
Maintains comprehensive technical documentation of cloud infrastructure, policies, and procedures.
Designs cloud infrastructure for robustness, security, and observability, including logging and alerting solutions (e.g., Prometheus, Grafana, ELK Stack, Splunk).
Provides leadership and technical expertise in building a technical plan and backlog of stories, then follows through on the design and build process to production delivery.
Supports the company’s commitment to risk management and protects the integrity and confidentiality of systems and data.
Requirements
Education and/or experience typically obtained through a bachelor's degree in computer science or a related technical field.
At least 8 years in Cloud Engineering or DevOps roles, with 3+ years of hands-on experience in AWS.
Deep expertise in AWS core services (EC2, EKS, Lambda, RDS, S3, IAM) and strong knowledge of Kubernetes (EKS) and Linux environments.
Hands-on Docker experience.
Demonstrated experience in delivering business-critical systems to the market.
Ability to influence and work in a collaborative team environment.
Knowledge of mature engineering practices (CI/CD, testing, secure coding, etc.), plus software development methodologies (Agile, Scrum, LEAN) and DevOps practices.
Proven track record implementing security & compliance controls aligned with frameworks like PCI-DSS, SOC 2, or FFIEC.
Experience maintaining Infrastructure as Code with tools such as Terraform, AWS CloudFormation, or Ansible.
Experience in building and automating observability at scale for complex transactional systems.
High level of customer responsiveness, excellent documentation and communication skills, attention to detail, and a clear understanding of how to prioritize work and deal with ambiguity.
Current AWS (and/or other cloud-based) certification(s).
Tech Stack
Ansible
AWS
Cloud
Docker
EC2
Grafana
Kubernetes
Linux
Prometheus
Splunk
Terraform
Benefits
Healthcare Coverage – Competitive medical (PPO/HDHP), dental, and vision plans as well as company contributions to your Health Savings Account (HSA) or pre-tax savings through flexible spending accounts (FSA) for commuting, health & dependent care expenses.
401(k) Retirement Plan – Featuring a 100% Company Safe Harbor Match on your first 6% deferral immediately upon eligibility.
Paid Time Off – Unlimited Time Off for Exempt (salaried) employees, as well as generous PTO for Non-Exempt (hourly) employees, plus 11 paid company holidays and a paid volunteer day.
12 weeks of Paid Parental Leave
Maven Family Planning – Support throughout your parenting journey, including egg freezing, fertility, adoption, surrogacy, pregnancy, postpartum, early pediatrics, and returning to work.