Vultr is a leading cloud infrastructure company on a mission to provide high-performance cloud solutions. They are seeking a highly skilled Storage Operations Engineer to build, maintain, and operate their Cloud Storage environments, focusing on enhancing system stability, scalability, and performance.
Responsibilities:
- Continually enhance the Ceph-based Cloud Storage System in terms of stability, scalability, functionality, performance and cost
- Actively participate in the design and architecture decisions for Ceph-based Cloud Storage
- Develop automation frameworks and improvements for our Ceph-based Cloud Storage infrastructure
- Optimize existing metrics collection and alerting systems
- Contribute to our internal documentation and knowledge base
- Monitor Cloud Storage infrastructure to ensure high availability and reliability of our services
- Partner with Network and Operations teams for capacity expansion and maintenance windows
- Evaluate new technologies, advanced storage features, and new operational methodologies to drive efficiency and effectiveness
- Research and develop cutting-edge features for Ceph-based Cloud Storage
- Participate in meetings as required
- Take part in job shadowing with new hires or other team members as required
- Communicate effectively across teams and functions
- Work independently as well as collaborate with team members and project stakeholders
Requirements:
- Problem-solving skills using foundational data structures and distributed systems concepts
- Fluent in Linux (CentOS/RHEL/Debian/Ubuntu)
- 3+ years of extensive experience configuring, deploying, and managing distributed systems
- Hands-on experience with cloud hosting, object storage, and block storage concepts
- Experience with performance tuning in virtualized environments
- Knowledge of server hardware including IPMI and storage drives (SATA, NVMe)
- Strong automation and scripting experience (Python/Bash/PHP/etc.)
- Hands-on experience with automation tools (Puppet/Ansible/Chef/Salt/etc.)
- Effective communication skills
- Excellent time management skills
- The ability to manage work independently
- Strong collaboration skills
- Experience with Ceph or other software-defined storage is a plus
- Experience with logging, metrics, and time-series tools (Prometheus, collectd, ELK, Grafana, Graylog, Graphite, etc.) is a plus