Site Reliability Engineer at StarTree | JobVerse
StarTree
Remote
Site Reliability Engineer
India
Full Time
2 hours ago
Visa Sponsorship
Key skills
Apache
AWS
Azure
Cloud
Distributed Systems
Google Cloud Platform
Java
Kafka
Kubernetes
Pulsar
Spark
GCP
Google Cloud
Critical Thinking
About this role
Role Overview
Leverage various monitoring and alerting services to solve intricate programming problems at scale.
Manage and tune multiple critical customer-facing Apache Pinot clusters
Monitor availability, read/write latencies, and other key telemetry to proactively identify SLO misses and help mitigate issues
Build rapport and work closely with customers to mitigate and resolve incidents
Execute disaster recovery strategies with minimal downtime
Collaborate with other engineers to understand and troubleshoot systems, and use that experience to influence other teams' roadmaps
Requirements
5+ years of experience as an engineer (SRE, SDET, or development)
Experience managing highly available, production-facing distributed systems and in-depth knowledge of Java are a plus
Experience with cloud platforms such as AWS, GCP, or Azure
Experience with Kubernetes and container orchestration
Familiarity with streaming systems, such as Kafka, Pulsar, Flume, Flink, Spark, or similar
Knowledge of standard methodologies related to security, performance, and disaster recovery
Strong troubleshooting and critical thinking skills
Benefits
Health insurance
Flexible work arrangements
Professional development