DigitalOcean is a cutting-edge technology company focused on simplifying cloud solutions. They are seeking an entry-level Systems Engineer I to optimize and troubleshoot data center hardware, qualify and deploy firmware packages, and engage with vendors to enhance their ecosystem.
Responsibilities:
- Work with vendors and internal peer teams on qualifying, onboarding, and delivering new firmware to the DigitalOcean ecosystem
- Act as Tier 3 escalation on-call for triage, investigation, and resolution of system firmware issues in the DigitalOcean fleet (both customer-facing and internal)
- Participate in 24/7 on-call rotation with other members of the team
- Improve existing firmware and hardware configuration automation/validation, for both hardware platforms and components (such as NIC, Storage and BMC)
- Engage with hardware vendors about new automation features and existing bugs
- Help with development of tooling and associated runbooks to address gaps in operational capabilities around hardware and firmware operations
- Coordinate with Ops teams on monitoring thresholds, failure modes and alerting
- Assist in troubleshooting causes of failures and work to prevent them in the future
- Raise the quality bar in the delivery of our cloud infrastructure by identifying industry best practices and working to adopt them
Requirements:
- Technical Degree (BS Computer Science/Engineering) or equivalent practical experience
- Strong understanding of x86 server hardware architecture and subsystems
- Demonstrated professional proficiency in configuration management best practices (we use Ansible and Chef)
- Experience automating server firmware components at large scale using industry-standard tooling (Redfish, IPMI, etc.), including a deep understanding of benchmarking, automated test frameworks, and process automation in general
- Practical knowledge of PXE boot, UEFI, Linux/OS boot, AMI/OEM BIOS distributions, OpenBMC/AMI/OEM BMC implementations, RAID and other storage resiliency technologies, and the full network stack, from NIC firmware to TCP/IP
- Adept at Linux (or Unix) operating systems
- Comfortable with version control systems (we use Git) and proficient in at least one programming language (such as Python or Go)
- Ability to participate in 24/7 on-call rotation with other members of the team
- Excellent communication skills, both within the team and with the broader company
- Have an insatiable passion for hardware, both new and old
- Ideally, you've worked with non-x86 hardware too!