Drive end-to-end capacity planning and performance engineering across infrastructure and application layers
Provide hands-on technical leadership and escalation support across OS, database, and application stacks
Identify capacity risks, performance bottlenecks, and system saturation trends, and mitigate them proactively
Drive cost optimization (FinOps) through right-sizing and efficient workload placement
Lead root cause analysis and prevention strategies for recurring incidents
Partner with Architecture, SRE, DevOps, and Application teams to influence design and scalability decisions
Build and lead a high-performing team, ensuring technical depth and execution excellence
Communicate risks, insights, and recommendations to executive leadership
Requirements
Experience across infrastructure, capacity management, performance engineering, or SRE, with 3+ years of management experience
Proven capability as a hands-on technical leader across multiple stacks
Deep expertise in: Windows and Linux operating systems; database platforms (Oracle, SQL Server/SQL, MongoDB, etc.); JVM performance analysis and tuning; distributed and large-scale systems