Fireblocks is a company that provides a secure platform for digital assets, trusted by major financial institutions. The Site Reliability Engineer (SRE) will improve monitoring and observability of services, handle critical incidents, and work closely with R&D to enhance system reliability and performance.
Responsibilities:
- Research Fireblocks blockchain workflows, identify issues and optimization opportunities, and improve monitoring
- Help identify the root causes of incidents and prevent them from recurring. Coordinate outage resolution across multiple teams
- Improve and establish alerting for our infrastructure, services, and business logic
- Work closely with R&D and Support, offering education and guidance on integration, support, and monitoring across the toolset
- Communicate and escalate issues to senior management in R&D and Support, write RCAs, and define next steps
- Document actions in runbooks, then turn them into automation using Python, Lambda, shell scripts, ArgoCD, and Ansible
- Focus on the system's observability, availability, reliability, performance/latency, and monitoring
- Conduct periodic on-call duties and emergency response
Requirements:
- At least 3 years of experience as an SRE or Infra/Backend engineer in a SaaS environment
- You are curious, self-motivated, easy to work with, responsible, and production-aware; a fast learner who can take a project from POC to production while handling decision-making and communication
- Experience with coding languages: Python/JavaScript/Bash (must)
- At least 3 years of experience with alerting and monitoring systems such as DataDog / Coralogix / Splunk / New Relic / Prometheus
- Experience working with Linux systems from kernel to shell and beyond
- Cloud systems such as AWS / Google Cloud / Azure
- Configuration management, such as Ansible/Chef/Puppet/ArgoCD
- Experience with Docker, Kubernetes, and Helm
- SCM - Git/Bitbucket/GitLab/Phabricator/Gerrit
- Strong analytical and troubleshooting skills, with the ability to solve complex problems
- Strong verbal and written communication skills and a collaborative mindset
- Previous experience in cryptocurrencies / blockchains - a big advantage
- In-depth knowledge of: Linux optimization, nginx, ArgoCD, DataDog, MySQL
- Participation in Kubernetes migration projects
- Previous experience as a C++ or Node developer
- BSc in Computer Science or related technical certifications