Oversee and enhance our cloud infrastructure, ensuring reliability, availability, and performance by applying SRE best practices.
Build and sustain robust monitoring and alerting systems, automating routine tasks and workflows.
Handle incident response and root cause analysis for production issues to improve system stability and maintain outstanding uptime.
Create and maintain detailed documentation for infrastructure and processes.
Ensure systems comply with security and regulatory standards through regular audits and assessments.
Define, lead and oversee technical projects from start to finish.
Collaborate with developers and the product team.
Develop flowcharts, layouts and documentation to identify requirements and solutions.
Work with and coach other DevOps Engineers through code reviews, quality reviews, architecture discussions, and technical outlines to build infrastructure that is easy to maintain and update.
Transform legacy systems into enterprise software by applying standards and best practices.
Requirements
5+ years of experience as a DevOps Engineer or in a similar software engineering role, and experience with Site Reliability Engineering (SRE) principles and practices.
Extensive experience with high-performance, scalable cloud infrastructure and containerization technologies such as Docker and Kubernetes.
Strong experience with monitoring, alerting and logging tools and practices.
Experience with Infrastructure as Code.
Experience working in a fast-paced agile environment.
Experience in building complex systems and high-volume transaction applications.
Familiarity with software development, CI/CD pipelines and tools.
Ability to document requirements and specifications.
Ability to quickly troubleshoot complex issues.
A bias for action, a desire for ownership, and a strong problem-solving attitude.
Ability to work both in a team and alone, and to manage your own workload.
You’re kind and empathetic, relying on and supporting the rest of the team for discussions and reviews.
You are able to craft a vision and clarify an ambiguous problem space into clear, tangible building blocks of a roadmap.
Tech Stack
Cloud
Docker
Kubernetes
Benefits
Be part of a forever remote-first (not hybrid) company.
Great remuneration package with bonuses and equity for top performers.
Antavo Care (Private Health Insurance & Mental Health Support).
You can join AYCM SportPass.
A dynamic, no corporate-BS environment to learn, grow, and really make an impact.
You will have a strong team around you to support you in reaching your goals.