AMISEQ is seeking a Site Reliability Engineer to manage foundational services that are critical to their fleet operations. The role involves automating rack provisioning, monitoring datacenter environments, and ensuring the health of the fleet while participating in on-call responsibilities with the team.
Responsibilities:
- Own foundational services that serve as a core component of the fleet, such as DHCP, DNS, NTP, PXE
- Build and test the latest operating system and kernel releases, and keep the fleet up to date with them
- Own full-stack services that automate end-to-end rack provisioning, including:
  - Network device detection and provisioning
  - OS installation and config management
  - Tooling to monitor datacenter environments, such as power, temperature, and humidity
- Own services that monitor the health of our fleet and host remediations
- Share on-call rotation and responsibilities with your team
Requirements:
- BS in Computer Science or related technical field, or equivalent technical experience
- 3+ years of coding experience, Python and/or Golang preferred
- 1+ years of experience working with Linux in a production environment
- Demonstrated skill developing distributed systems
- Familiarity with fundamental services, such as DHCP, DNS, NTP, or PXE
- Familiarity with config management tools, such as Ansible, Chef, or Puppet
- Experience using monitoring tools to maintain the reliability of production services
- 3+ years of experience working with Linux
- Experience with cloud compute services, such as AWS, Google Cloud, or Microsoft Azure
- Nice to have: Experience with frontend development, such as JavaScript frameworks like Angular or React