Sailplane is an early-stage AI infrastructure startup building autonomous infrastructure for enterprise computing. The Principal Engineer will lead the development of intelligent agents that manage AI data centers. The role combines hands-on coding with systems thinking, and involves working with the team to prioritize projects based on customer needs and technical feasibility.
Responsibilities:
- Design and evolve the control plane for agents: planning and execution loops, workflows, callbacks, and state models
- Implement sandboxing, dry-run/preview modes, invariants, approvals, and rollback strategies so agents can safely change real infrastructure and applications
- Turn hierarchical planning and agent research prototypes into production-grade agents, and build the services and APIs around them (auth, rate limits, quotas, retries)
- Anticipate production system needs (security, networking, SLOs/SLAs) and instrument for reliability and usage via observability (logs, traces, guardrails)
- Guide architecture trade-offs and reason about IP boundaries as an engineering design dimension
- Build, deploy, monitor, and operate LLMs in production, on-premises and across diverse customer environments, applying MLOps best practices (CI/CD pipelines, containerization, continuous monitoring) to ensure reliable performance
- Lead through influence, setting engineering standards for code quality, testing, documentation, and on-call practices
- Partner with founders to prioritize what to build next based on customer needs, market demand, and technical feasibility
- Conduct interviews and evaluate candidates across engineering, product, and design roles, helping build Sailplane's early team