Act as a subject matter expert, defining and driving architectural enhancements, design, and integration solutions across the platform using serverless and container technologies
Take ownership of designing and building reliable systems, ensuring high standards for durability, availability, disaster recovery, and performance
Define, monitor, and enforce SLIs, SLOs, and SLAs, and establish proactive alerting and real-time dashboards for deep insights into system health
Work with the developer experience team to develop and evolve CI/CD workflows, automating tasks across the full software development lifecycle
Serve as a top-level technical mentor to senior and junior engineers, promoting and enforcing methodologies and best practices
Partner with engineering, product, customer success, and other teams to address complex technical challenges
Requirements
8+ years of experience in a DevOps, SRE, or Infrastructure Engineering role
Deep expertise in cloud-native technologies, particularly container orchestration (Kubernetes, ECS) and Infrastructure as Code (Terraform)
Expertise in implementing monitoring and observability solutions, including tools like Grafana, Coralogix, AWS CloudWatch, and OpenTelemetry (OTel)
Proven experience with CI/CD tools (e.g., Jenkins, GitHub Actions)
A solid understanding of advanced AWS networking, security, and infrastructure concepts
Expertise in scripting and automation using Python, Bash, or Go
Strong ability to communicate complex technical concepts clearly to both technical and non-technical stakeholders
Self-motivated and capable of working autonomously
An analytical thinker with strong attention to detail
A natural leader who enjoys guiding and upskilling other engineers