Lead and manage daily cloud operations to ensure high availability, performance, reliability, and scalability of cloud infrastructure and services.
Direct incident response, root cause analysis, and problem resolution efforts to minimize service disruption and drive continuous improvement.
Establish, document, and enforce cloud operational policies, procedures, and best practices aligned with organizational and regulatory requirements.
Partner with security, risk, and compliance teams to ensure cloud environments meet security standards, CSS Audits, regulatory obligations, and internal control requirements.
Monitor cloud usage, optimize resource utilization, and manage budgets to ensure cost-effective and efficient cloud operations.
Drive automation, monitoring, and tooling improvements to enhance operational efficiency, observability, and system resilience.
Collaborate with operations, engineering, application development, and business teams to support cloud initiatives, changes, and production deployments.
Mentor and develop cloud operations staff.
Requirements
Bachelor’s degree in Information Technology, Computer Science, Engineering, or a related field; Master’s degree preferred.
6+ years of progressive IT infrastructure or cloud operations experience, with at least 3 years in a leadership or management role.
Proven track record leading incident response, on-call operations, and operational maturity initiatives in a 24/7 environment.
Hands-on experience with cloud cost management, monitoring, automation, and infrastructure-as-code tools.