Site Reliability Engineer at Cayuse Holdings | JobVerse
Cayuse Holdings
Site Reliability Engineer
United States
Full Time
Posted 3 hours ago
$120,000 - $160,000 USD
No H1B
Key skills
Ansible
Docker
Grafana
Kubernetes
Prometheus
Python
Splunk
Terraform
Go
Bash
Datadog
Communication
About this role
Role Overview
Ensure the high availability and reliability of critical systems and services across production and development environments.
Monitor and improve system latency, performance, and overall efficiency through proactive measures and tuning.
Conduct performance benchmarking, capacity analysis, and optimize workloads for scalability and cost-effectiveness.
Act as the first line of defense for incidents, managing and resolving emergencies and minimizing downtime.
Implement post-incident reviews and root cause analyses to enhance system reliability and prevent recurring issues.
Manage system changes, ensuring they are properly planned, tested, and implemented with minimal risk.
Develop automation tools and scripts to eliminate manual, repetitive tasks and improve operational efficiency.
Design and implement robust monitoring and alerting solutions to detect anomalies and resolve issues proactively.
Collaborate closely with development teams to embed reliability principles into the application development lifecycle.
Requirements
Level 1: 1-3 years of experience in the field or in a related area.
Level 2: 4-7 years of experience in the field or in a related area.
Level 3: 8 or more years of experience in the field or in a related area.
Exceptional interpersonal skills with the ability to communicate in a clear, professional, and articulate manner.
Exceptional verbal and written communication skills.
Excellent organizational, analytical, and problem-solving skills with high-level attention to detail.
Strong multitasking skills with the ability to manage multiple design streams across concurrent work efforts.
Must be self-motivated and able to work well independently as well as on a multi-functional team.
Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent experience.
Proficiency with monitoring and observability tools (e.g., Grafana, Prometheus, Datadog, Splunk).
Hands-on experience with automation tools (e.g., Terraform, Ansible) and coding/scripting languages, such as Python, Go, or Bash.
Solid understanding of containerization and orchestration technologies (e.g., Docker, Kubernetes).
Tech Stack
Ansible
Docker
Grafana
Kubernetes
Prometheus
Python
Splunk
Terraform
Go
Benefits
Medical, Dental and Vision Insurance
Wellness Program
Flexible Spending Accounts (Healthcare, Dependent Care, Commuter)
Short-Term and Long-Term Disability options
Basic Life and AD&D Insurance (Company Provided)
Voluntary Life and AD&D options
401(k) Retirement Savings Plan with matching after one year
Paid Time Off