D3 is building the world’s first purpose-built blockchain for DomainFi, revolutionizing the ownership and trading of domain names in the digital economy. They are seeking a highly skilled Site Reliability Engineer to maintain system reliability, performance, and security while collaborating with Development and Product teams to support business growth.
Responsibilities:
- Maintain, troubleshoot, and optimize Kubernetes environments using Helm and kubectl
- Perform system-level troubleshooting and administration on Linux-based systems
- Manage and troubleshoot networking, DNS, routing, and VPN configurations
- Configure and manage firewall rules across cloud and on-premises environments
- Debug and resolve infrastructure and application issues with strong analytical skills
- Monitor system performance and reliability by implementing and managing monitoring solutions
- Identify and address database performance issues, including query optimization and caching improvements
- Continuously improve security procedures to prevent downtime and data loss
- Collaborate closely with Development and Product teams to support system scalability and uptime
Requirements:
- 3+ years of experience as an SRE or DevOps Engineer
- 3+ years of experience with Kubernetes (and tools like Helm, kubectl, etc.)
- Experience with Linux administration and troubleshooting
- Bachelor's degree in Computer Science, Engineering, or related field (or equivalent practical experience)
- Exceptional attention to detail, especially in system configurations and deployment processes
- Willingness to dig into sometimes poorly defined problems until they are fully understood
- Ability to propose and implement solutions that move the business forward
- Ability to work with Development and Product teams to ensure high uptime