Function as an individual contributor within the team: collaborate actively with peers through thorough code reviews, provide constructive support and mentorship, and contribute to a unified technical direction for the platform.
Architect, Design, and Implement Infrastructure as Code (IaC): You will treat our infrastructure as a sophisticated software system, responsible for its comprehensive lifecycle management using Terraform.
Deploy, Manage, and Optimize Kubernetes Clusters on GCP (GKE) and AWS (EKS): You will take ownership of the deployment, configuration, and ongoing maintenance of our Kubernetes clusters on Google Kubernetes Engine (GKE) and Amazon Elastic Kubernetes Service (EKS).
Develop and Refine Internal Software Delivery Systems (CI/CD): You will design, implement, and maintain robust Continuous Integration/Continuous Deployment (CI/CD) systems specifically tailored for our platform components.
Diagnose, Troubleshoot, and Resolve Platform-Related Issues: You will be the primary point of contact for diagnosing and resolving platform-related issues, including performance bottlenecks, scalability challenges, and security vulnerabilities.
Drive Automation Initiatives to Streamline Operational Tasks and Enhance System Reliability: You will champion automation initiatives to eliminate manual operational tasks, reduce human error, and improve overall system reliability.
Act as a Strategic Partner to Development Teams, Understanding and Addressing Their Infrastructure Needs: You will foster strong relationships with feature teams, treating them as your internal customers.
Drive Development of Internal Platform Products and Services: You will actively participate in the full software development lifecycle (design, code, test, and deploy) of internal tools and APIs.
Implement and Enforce Rigorous Security Best Practices and Ensure Compliance with Industry Standards: You will be responsible for implementing and enforcing robust security best practices across our platform.
Requirements
Strong software engineering fundamentals with proven hands-on coding experience.
6+ years of proven experience in platform engineering, DevOps engineering, or related roles, with a strong track record of building and maintaining complex cloud infrastructure.
Strong hands-on experience with GCP/AWS, Kubernetes (GKE/EKS), and Terraform.
Demonstrated expertise in building and maintaining scalable, reliable, and secure cloud infrastructure, with a focus on automation and efficiency.
Demonstrated coding proficiency in Go or TypeScript (or other relevant languages), including experience with data structures, algorithms, and defensive programming.
Proven experience with CI/CD tools, such as Argo CD, Atlantis, or similar technologies, and a deep understanding of CI/CD principles and best practices.
Understanding of networking concepts and protocols.
Extensive experience with monitoring and logging tools, such as Prometheus, Grafana, and the ELK stack, and a proven ability to use these tools to diagnose and resolve performance issues.
Knowledge of security best practices for cloud environments.
Excellent communication skills in English, both written and verbal.
Self-organized, goal-oriented, and self-motivated.
Ability to work effectively in a remote and distributed team environment.
Prior experience working specifically on platform engineering projects.