AlphaSense is a market intelligence company that provides insights through AI-driven technology. The company is seeking a highly experienced Staff Site Reliability Engineer to enhance the reliability, scalability, and performance of its platform, while mentoring other engineers and driving the adoption of SRE best practices.
Responsibilities:
- Architect Reliability Paved Paths: Build frameworks and self-service tooling that let teams own the reliability of their services in a 'You Build It, You Run It' culture
- Lead AI-Driven Reliability: Drive our AIOps strategy by automating diagnostics, remediation, and proactive failure prevention
- Champion Reliability Culture: Embed SRE practices across engineering via design reviews, production readiness, and operational standards
- Incident Leadership: Act as Incident Commander during critical events, model operational excellence, and ensure blameless postmortems lead to lasting improvements
- Advance Observability: Deliver end-to-end monitoring, tracing, and profiling (Prometheus, Grafana, OTEL, Continuous Profiling) to optimize performance proactively
- Mentor & Multiply: Elevate engineers across SRE and product teams through mentorship, technical guidance, and knowledge sharing
Requirements:
- 8+ years of experience in Site Reliability Engineering, DevOps, or a similar role, with at least 3 of those years in a Senior+ SRE position
- Strong background in running production SaaS systems at scale
- Proficiency in at least one programming/scripting language (Python, Go, or similar)
- Hands-on expertise with cloud platforms (AWS, GCP, or Azure) and Kubernetes
- Deep understanding of networking fundamentals (TCP/IP, DNS, HTTP/S, load balancing)
- Experience with monitoring & alerting (Prometheus, Grafana, Datadog, ELK)
- Familiarity with advanced observability (OTEL, continuous profiling)
- Proven incident management experience, including leading high-severity incidents and postmortems
- Strong troubleshooting skills across the full stack
- Excellent communication and collaboration skills