PrizePicks is the fastest-growing sports company in North America, recognized for its leading platform in Daily Fantasy Sports. The Senior Site Reliability Engineer will ensure the reliability, scalability, and performance of the infrastructure, while also leading incident response and mentoring other engineers.

Responsibilities:

Design, implement, maintain, and monitor reliable production systems at scale
Lead incident response, mitigate production issues, and conduct postmortem analysis
Proactively monitor performance, analyze system failures, identify bottlenecks, and propose solutions
Create and support observability/monitoring tools and vendor integrations
Drive the growth of a reliability culture, promoting cross-functional collaboration towards improving system reliability, scalability, resilience, and security
Train and mentor other engineers

Requirements:

5+ years of experience as a reliability-focused engineer in a fast-paced, rapidly growing enterprise environment
Deep understanding of tooling and application development in these areas:
  Cloud computing platforms such as AWS, Azure, and/or GCP
  Infrastructure-as-code tools such as Terraform or Crossplane
  Developing applications in languages such as Python, Ruby, or Go
  Deploying and supporting applications in Kubernetes at scale
  Implementing monitoring in tools like Grafana, New Relic, or Datadog
Experience debugging live, critical production issues
Familiarity with reliability principles, such as resilient systems, application and supply chain security, and SLO governance
Ability to work cross-functionally with diverse engineering teams
Candidates based in Atlanta are preferred, but qualified applicants from anywhere in the U.S. are welcome

Senior Site Reliability Engineer (SRE)
