Kraken is a mission-focused company rooted in crypto values, seeking to accelerate global adoption of crypto. As a Senior Site Reliability Engineer, you will manage infrastructure, improve CI/CD pipelines, and support operational excellence within the Payward Services unit.
Responsibilities:
- Manage and support infrastructure for Payward Services, including Nomad, Kubernetes, databases, and third-party system integrations
- Provide operational support across multiple teams, helping debug issues in staging and production environments
- Participate in incident response and post-incident reviews to improve system resilience
- Consult with teams on performance, monitoring, and alerting best practices, with awareness of partner-facing SLA commitments
- Build tooling, automation, and dashboards to improve observability and empower development teams
- Maintain and troubleshoot CI pipelines, ensuring reliable and fast build, test, and deployment cycles
- Collaborate with developers, QA, and product managers to streamline development and release cycles
- Support a fully distributed team operating across multiple timezones
Requirements:
- 5+ years in a DevOps or SRE role
- Proficiency with hybrid-cloud infrastructure environments
- Proficiency with Git version control and CI/CD configuration
- Deep understanding of monitoring and alerting systems, preferably Prometheus and Grafana
- Ability to debug complex issues in distributed systems, networks, and Linux operating systems
- Containerization and orchestration experience (Docker, Nomad, Kubernetes a plus)
- Strong scripting skills (Bash, Python, or Go)
- Self-starter capable of working independently and remotely in fast-paced environments
- Background working with distributed systems and technologies (Kafka, gRPC, Redis, etc.)
- Experience operating services with external SLAs or in a B2B/enterprise context
- Experience with benchmarking, performance tuning, and identifying system bottlenecks
- Proficiency with databases (SQL and NoSQL) and production operations experience
- Interest in lower-level programming languages such as Rust
- Experience integrating with APIs (GitLab, Jira, Slack)