Gruve is an innovative software services startup dedicated to transforming enterprises into AI powerhouses. They are seeking a Lead Principal Engineer to provide advanced technical leadership for global Network & Security operations, ensuring the stability, security, and performance of enterprise systems while mentoring engineers and driving continuous improvement.
Responsibilities:
- Serve as the top-level escalation point for complex network/security incidents
- Lead major incident resolution, RCA, and post-incident improvements
- Ensure high availability, performance, and security of global platforms
- Provide deep expertise in routing/switching, LAN/WAN, SD-WAN, wireless, data center networking, firewalls, VPNs, IDS/IPS, ISE, Zero Trust, cloud networking, load balancing, DNS, and DDoS protection
- Support hybrid cloud/on-prem connectivity
- Review and validate network designs, configurations, and implementation plans
- Support 24x7 follow-the-sun operations and after-hours escalations
- Drive monitoring optimization and automation to reduce MTTR
- Collaborate with SRE and platform teams on reliability initiatives
- Ensure readiness for upgrades, maintenance, and deployments
- Maintain runbooks, SOPs, diagrams, and documentation
- Implement automation and IaC (Terraform, Ansible, CloudFormation)
- Support ITSM processes (change, incident, problem)
- Coach mid-level engineers across global shifts
- Act as technical authority and promote best practices
- Work with Architecture/Security/Engineering teams on roadmaps and tech adoption
- Ensure compliance with security policies and audit requirements
- Communicate effectively with leadership, vendors, and partners
Requirements:
- 8+ years in enterprise Network/Security engineering
- Experience in 24x7 global operations
- Strong hands-on skills in networking, firewalls, VPNs, and cloud
- Proven incident response leadership
- Ability to make strong technical decisions under pressure
- Willingness to work on a contract basis
- Relevant certifications (e.g., CCIE, cloud networking)
- Automation/scripting experience (Python, Ansible, Terraform)
- ITIL exposure and experience working in regulated environments
- Experience with global, multi-time-zone teams
- Technical leadership
- Incident/crisis management
- Clear communication
- Analytical problem solving
- Focus on continuous improvement