SpotOn is a company focused on providing independent restaurants with the technology they need to succeed. The Platform Engineering Manager will lead the Client Environments team to enhance the reliability and observability of restaurant networks and devices, ensuring operational excellence and proactive issue resolution.
Responsibilities:
- Lead and mentor engineers across Network and Android (Elo) systems — building a strong culture of ownership and reliability
- Drive GitOps adoption for network and device configuration, ensuring deployments are consistent, testable, and reversible
- Oversee MDM and device lifecycle management (Elo tablets, Android handhelds), ensuring clean provisioning and policy enforcement
- Run the operational loop: stay close to client incidents, analyze recurring issues, and drive root-cause elimination through system changes, automation, and better visibility
- Collaborate with Core Services (Device Registry, MDM, Sidecar) and NOC to improve observability, alerting, and response workflows
- Standardize configurations and rollout models (base + overlays) to eliminate variance across restaurant networks
- Design for resilience: enable cellular failover, LTE monitoring, and automated recovery patterns through controllers
- Own service quality metrics — uptime, response time, issue recurrence — and report progress on reliability improvements