Bloxstaking is the core team behind the SSV Network, pioneering decentralized infrastructure for Ethereum staking. As a Senior Site Reliability Engineer, you will work at the intersection of cloud infrastructure and blockchain, building reliable platforms for product teams and integrating AI into engineering workflows.
Responsibilities:
- Design and implement infrastructure and tools that empower our product teams to iterate rapidly and securely, emphasizing reliability and automation
- Influence the strategic direction of our infrastructure and operational practices, ensuring that we are well-positioned to scale and support our growing organization
- Take a proactive role in the resolution of production issues, ensuring that we are well-prepared to handle incidents and that we learn from them in a blameless manner
- Work closely with product teams on crucial initiatives such as production deployments, release management, and incident handling, aiming for seamless operations
- Offer technical expertise and input to support the continual adoption and modernization of our platform and infrastructure
- Build and deploy AI-powered tooling (autonomous coding agents, LLM-assisted CI/CD, automated incident triage) that makes the engineering org more productive. Think: sandboxed environments where agents can write, test, and verify code without human babysitting
- Foster a culture of continuous learning and improvement, encouraging constructive review and adaptation processes
Requirements:
- Kubernetes expertise, with a strong understanding of its core concepts and the ability to manage and maintain clusters
- Expertise with modern cloud-native tools, e.g. ArgoCD for GitOps, Terraform/Crossplane for IaC, and the Grafana LGTM stack (Loki, Grafana, Tempo, Mimir) for observability
- 3-5 years of experience with Infrastructure as Code and cloud provisioning tools - Must
- 3-5 years of development and scripting experience in languages such as Go or Python - Must
- Proficient in both written and spoken English, with exceptional communication abilities
- Expertise in Linux environments, containerization, and cloud technologies
- Comprehensive knowledge of production management concepts for distributed systems
- 3-5 years of experience in operational roles, managing production environments
- AI fluency. You use AI coding tools daily and have opinions about what works. More importantly, you can build and deploy LLM-powered developer tooling and autonomous agents, not just consume them. We want someone who thinks about how to make an entire engineering team more productive with AI
- Networking knowledge: bonus points for experience with service mesh, platform engineering, and cross-cloud networking
- Familiarity with the Ethereum ecosystem, staking, and blockchain technologies - Advantage