Own GovWell’s deployment and reliability platform, ensuring systems are production-safe, observable, and resilient as customer and engineering load grows.
Build and evolve CI/CD foundations that improve deployment speed, safety, and repeatability across multiple teams shipping independently.
Establish and maintain observability systems, including metrics, logging, tracing, alerting, and on-call foundations.
Lead incident response practices, including playbooks, escalation paths, postmortems, and continuous reliability improvements.
Own platform-level security and compliance foundations, including SOC 2 readiness, vulnerability management, and access controls.
Improve infrastructure efficiency and cost performance in partnership with engineering leadership.
Build internal tooling that enables product engineers to focus on feature development rather than infrastructure overhead.
Requirements
5+ years of experience as a backend, infrastructure, or platform engineer operating production SaaS systems at scale.
Strong systems engineer with a deep understanding of reliability, distributed systems, and operational failure modes.
Comfortable owning CI/CD, observability, incident response, and production operations end-to-end.
Advanced user of modern AI tools (Cursor, Copilot, GPT-based workflows).
Strong coder (Node/TypeScript or similar); you build systems, not just configure tools.
High-ownership, startup-oriented operator comfortable managing a broad platform surface area.
Clear written and oral communicator who can document systems, write RFCs, and lead postmortems.