Bloxstaking is the core team behind the SSV Network, pioneering decentralized infrastructure for Ethereum staking. As a Senior Site Reliability Engineer, you will work at the intersection of cloud infrastructure and blockchain, building reliable platforms for product teams and integrating AI into engineering workflows.

Responsibilities:

Design and implement infrastructure and tools that empower our product teams to iterate rapidly and securely, emphasizing reliability and automation
Influence the strategic direction of our infrastructure and operational practices, ensuring that we are well-positioned to scale and support our growing organization
Take a proactive role in the resolution of production issues, ensuring that we are well-prepared to handle incidents and that we learn from them in a blameless manner
Work closely with product teams on crucial initiatives such as production deployments, release management, and incident handling, aiming for seamless operations
Offer technical expertise and input to support the continual adoption and modernization of our platform and infrastructure
Build and deploy AI-powered tooling (autonomous coding agents, LLM-assisted CI/CD, automated incident triage) that makes the engineering org more productive. Think: sandboxed environments where agents can write, test, and verify code without human babysitting
Foster a culture of continuous learning and improvement, encouraging constructive review and adaptation processes

Requirements:

Kubernetes expertise, with a strong understanding of its core concepts and the ability to manage and maintain clusters
Expertise with modern cloud-native tools, e.g. ArgoCD for GitOps, Terraform/Crossplane for IaC, and the Grafana LGTM stack (Loki, Grafana, Tempo, Mimir) for observability
3-5 years of experience using Infrastructure as Code and cloud provisioning tools - Must
3-5 years of development and scripting experience in Go, Python, or a similar language - Must
Proficient in both written and spoken English, with exceptional communication abilities
Expertise in Linux environments, containerization, and cloud technologies
Comprehensive knowledge of production management concepts for distributed systems
3-5 years in operational roles overseeing production environments
AI fluency. You use AI coding tools daily and have opinions about what works. More importantly, you can build and deploy LLM-powered developer tooling and autonomous agents, not just consume them. We want someone who thinks about how to make an entire engineering team more productive with AI
Networking knowledge: bonus points for service mesh experience, platform engineering and cross-cloud networking
Familiarity with the Ethereum ecosystem, staking, and blockchain technologies - Advantage
