Podium is a company that brings AI Employees to local businesses, enabling them to turn conversations into revenue. The Senior Site Reliability Engineer will ensure the stability, scalability, and performance of Podium’s platform while collaborating with engineering teams and mentoring junior engineers.
Responsibilities:
- Work with technologies including Kubernetes, Helm, Docker, AWS, Terraform, Datadog, Honeycomb, Prometheus, Ansible, StrongDM, Python, Go, Ruby, GitLab/GitHub, and CI/CD pipelines
- Collaborate across Podium’s engineering community to identify areas for improvement, enhance reliability, and create a safer, more efficient system
- Participate in an on-call rotation, triaging and resolving production and development issues
- Partner with cross-functional teams to minimize downtime and ensure platform resilience
- Mentor junior engineers, fostering growth and technical excellence
Requirements:
- Bachelor's degree in a technical field or equivalent experience
- 6+ years of experience supporting production systems in a software or systems engineering role
- 3+ years of experience deploying, operating, and debugging server software on Linux
- Strong curiosity and a desire to learn continuously
- Willingness to participate in on-call rotations
- Experience with distributed systems and microservices
- Knowledge of system design principles
- Hands-on experience with cloud computing (AWS, GCP, or Azure)
- Familiarity with SOC 2, HIPAA, PCI, or similar compliance frameworks
- Experience building and maintaining CI/CD pipelines
- Deep expertise in infrastructure engineering