AMISEQ is seeking a Site Reliability Engineer to manage foundational services that are critical to their fleet operations. The role involves automating rack provisioning, monitoring datacenter environments, and ensuring the health of the fleet while participating in on-call responsibilities with the team.

Responsibilities:

Own foundational services that serve as a core component of the fleet, such as DHCP, DNS, NTP, and PXE
Build and test the latest operating system and kernel releases, and keep the fleet up to date with them
Own full-stack services that automate end-to-end rack provisioning, including:
Network device detection and provisioning
OS installation and config management
Tooling to monitor datacenter environments, such as power, temperature, and humidity
Own services that monitor the health of our fleet and handle host remediation
Share the on-call rotation and its responsibilities with your team

Requirements:

BS in Computer Science or related technical field, or equivalent technical experience
3+ years of coding experience, Python and/or Golang preferred
1+ years of experience working with Linux in a production environment
Demonstrated skill developing distributed systems
Familiarity with fundamental services, such as DHCP, DNS, NTP, or PXE
Familiarity with config management tools, such as Ansible, Chef, or Puppet
Experience using monitoring tools to maintain the reliability of production services
3+ years of experience working with Linux
Experience with cloud compute services, such as AWS, Google Cloud, or Microsoft Azure
Nice to have: Experience with frontend development, such as JavaScript frameworks like Angular or React
