Architect and Manage Cloud Infrastructure: Design, deploy, and maintain highly available and scalable environments within GCP.
Infrastructure as Code (IaC): Drive the adoption of Terraform to manage all infrastructure through code, ensuring environment consistency and version control.
Cloud Operations & Monitoring: Expand the observability capabilities of the platform and own the day-to-day health of production environments.
Collaborative DevOps: Partner closely with engineers to improve the developer experience, assisting with containerization (Docker/K8s) and deployment strategies.
CI/CD Automation: Build and optimize robust CI/CD pipelines to streamline application delivery, integrating automated testing and deployment gates.
Secure Infrastructure Design: Partner with security engineering to implement cloud hardening standards, IAM architecture and policies, secrets management, and cloud infrastructure and security best practices.
Performance Optimization: Analyze cloud spend and resource utilization to optimize for cost and performance across all environments.
Automated Remediation: Develop scripts and playbooks to automate routine operational tasks and enable self-healing infrastructure.
Requirements
5+ years of hands-on experience in a Cloud Engineering, SRE, or DevOps role, preferably within a cloud-native SaaS or startup environment.
Public Cloud Expertise: Strong proficiency in managing public cloud environments, with a primary focus on Google Cloud Platform (GCP).
Automation Mindset: Proven experience with Terraform, Ansible, or Pulumi for infrastructure automation.
Cloud Operations Expertise: Demonstrated experience managing production workloads at scale, including log aggregation, proactive alerting, and incident response.
Operational Mindset: A safety-first approach to changes, with experience in canary deployments, blue-green releases, and automated rollbacks.
Containerization: Solid understanding of Docker and orchestration platforms like Kubernetes (GKE).
Pipeline Experience: Hands-on experience building and maintaining CI/CD pipelines (e.g., GitHub Actions, GitLab CI, or Jenkins).
Scripting: Proficiency in at least one scripting language (e.g., Python, Bash) to automate manual processes.
Collaborative Problem-Solver: Ability to work across Engineering, Security, and Product teams to translate business requirements into technical architecture.
Tech Stack
Ansible
Cloud
Docker
Google Cloud Platform
Jenkins
Kubernetes
Python
Terraform
Benefits
100% Remote Company, within the USA
Comprehensive Medical, Dental, and Vision plans with a 100% employer-paid monthly premium option for employees & 50% employer-paid monthly premiums for dependents.
Health Savings Account with company contribution for eligible medical plans.
Flexible Vacation Plan
10 Paid Company Holidays
100% employer-paid Life, AD&D, and Short- and Long-Term Disability Insurance
401k with Traditional and Roth options, including employer match.
Company Equity
Paid Parental and Pregnancy Recovery Leave
Company and team off-sites and virtual events throughout the year