Senior Site Reliability Engineer at qode.world | JobVerse
Senior Site Reliability Engineer
qode.world
South Carolina, United States of America
Full Time
23 hours ago
No Visa Sponsorship
Key skills
Distributed Systems
Kafka
AI
ML
Dynatrace
Leadership
About this role
Role Overview
Design and implement unified observability dashboards across metrics, logs, traces, events, and topology
Define and manage SLIs, SLOs, and error budgets aligned to business outcomes
Build actionable dashboards for operations, engineering, and leadership
Implement alerting strategies using static and dynamic thresholds
Leverage AI/ML/AIOps to detect anomalies, predict incidents, and reduce MTTR
Transition monitoring from reactive alerts to proactive insights
Implement noise reduction, alert correlation, and root cause analysis
Apply baseline modeling, seasonality detection, and anomaly scoring
Monitor and troubleshoot multi-service architectures
Identify whether issues originate from upstream/downstream dependencies, streaming platform, infrastructure, or application code
Deep hands-on experience with Dynatrace (mandatory)
Requirements
15+ years in SRE / Production Engineering
Strong unified observability background (not infrastructure monitoring only)
Hands-on Dynatrace experience (metrics, traces, logs, Davis AI)
SLI/SLO engineering experience in production systems
Experience implementing dynamic thresholds and anomaly detection
Knowledge of AI/ML concepts applied to Ops (AIOps)
Distributed systems troubleshooting expertise
Experience with Kafka or streaming data platforms
Tech Stack
Distributed Systems
Kafka