AMP is applying AI-powered sortation at scale to modernize the world's recycling infrastructure. The Site Reliability Engineer will ensure AMP's technology operates smoothly, supporting hundreds of robotic systems and developing tools for increased software observability.
Responsibilities:
- Triage and respond to tickets, adhering to SLAs, from 9:00 am to 5:00 pm Eastern Time
- Participate in the pager duty rotation as we establish 24/5 escalation support for the facilities
- Provide support for CoreTech devices including commissioning support, software upgrades, tooling maintenance, and troubleshooting
- Troubleshoot operating system, on-prem hardware, networking, container, and application issues to the point of mitigation, resolution, or hand-off (all devices are on-prem in AMP facilities)
- Maintain and extend documentation for the engineering support process
- Help define improvements to the Jira ticketing system for ease of use and analytics tracking
- Complete development tasks focused on increasing observability of software issues and creating mitigation tools to use when those issues arise
- When subject matter experts are called upon in escalations, turn the lessons learned into tools that enable facilities to better self-serve
Requirements:
- Strong technical communication skills for collaborating with the rest of the software team through ticket escalations
- Strong interpersonal skills for communicating with people in industrial environments who are experiencing downtime issues that can be overwhelming
- Experience troubleshooting Linux systems
- Desire to learn and gain experience writing code, including professional software engineering practices like coding standards, code reviews, source control management, build processes, testing, and operations
- Drive to become more efficient over time, as the growth of facilities requires
- Proficiency managing task-level scoping for yourself under a sprint-based or Kanban methodology
- Passion for green technology and emissions reduction
- Real-world experience with deployed hardware
- Experience with Docker or similar technologies
- Experience troubleshooting to minimize mean time to recovery in downtime situations
- Comfort with reactive multitasking and rapid reprioritization