Fireblocks is a company focused on providing secure solutions for digital assets. The Site Reliability Engineer will improve monitoring and observability of services, handle critical alerts, and optimize system availability and performance for the digital asset custody and settlement platform.
Responsibilities:
- Research Fireblocks blockchain workflows, identify issues and optimization opportunities, and improve monitoring
- Help identify root causes of incidents and prevent them from recurring. Resolve outages by coordinating with multiple teams
- Improve and establish alerting for our infrastructure, services and business logic
- Work closely with R&D and Support, offering education and guidance on integration, support, and monitoring across the toolset
- Communicate and escalate issues to senior management in R&D and Support, write RCAs, define next steps
- Document actions in runbooks, then turn them into automation using Python, Lambda, shell scripts, ArgoCD, Ansible
- Focus on the system's observability, availability, reliability, performance/latency, monitoring
- Conduct periodic on-call duties and emergency response
Requirements:
- At least 3 years of experience as an SRE or Infra/Backend engineer in a SaaS environment
- You are curious, self-motivated, easy to work with, responsible and production-aware. You are a fast learner, able to take a project from POC to production while handling decision making and communication
- Experience with coding languages - Python/JavaScript/Bash (Must)
- At least 3 years of experience with alerting and monitoring systems such as DataDog / Coralogix / Splunk / New Relic / Prometheus
- Experience working with Linux systems from kernel to shell and beyond
- Cloud systems such as AWS / Google Cloud / Azure
- Configuration management such as Ansible/Chef/Puppet/ArgoCD
- Experience with Docker, Kubernetes and Helm
- SCM - Git/Bitbucket/GitLab/Phabricator/Gerrit
- Strong analytical and troubleshooting skills - ability to solve complex problems
- Strong verbal and written communication skills and a collaborative mindset
- Previous experience in cryptocurrencies / blockchains - big advantage
- In-depth knowledge of: Linux optimization, nginx, ArgoCD, DataDog, MySQL
- Participated in Kubernetes migration projects
- Previous experience as C++ or Node developer
- BSc in Computer Science or related technical certifications