Empower Pharmacy is a company focused on driving enterprise infrastructure reliability and security. The Staff DevOps & Site Reliability Engineer will be responsible for enhancing system uptime, product quality, and operational efficiency across hybrid and multi-cloud environments, while leading architecture and automation initiatives.
Responsibilities:
- Design and operate scalable hybrid and multi-cloud infrastructure across Azure, AWS, and on-prem environments, ensuring high availability, resilience, and cost efficiency while leveraging AI-driven insights to continuously optimize system performance, resource allocation, and architectural decisions aligned with enterprise growth and compliance requirements
- Build and maintain Infrastructure as Code frameworks using Terraform, Bicep, or similar tools, enabling consistent, auditable deployments while integrating AI-assisted automation to accelerate provisioning, reduce errors, and enhance infrastructure lifecycle management across complex distributed environments
- Engineer secure, high-performance networking solutions including hybrid connectivity, segmentation, VPNs, and zero trust architectures, utilizing AI-enhanced analytics to proactively detect vulnerabilities, optimize traffic flows, and ensure secure, compliant communication across all infrastructure layers
- Establish and evolve SRE practices including SLIs, SLOs, and error budgets, leveraging AI-driven observability platforms to improve system reliability, automate incident detection, and enable proactive remediation that minimizes downtime and enhances service quality at scale
- Lead incident response, root cause analysis, and post-incident improvements, applying AI-powered anomaly detection and predictive analytics to reduce mean time to resolution, prevent recurrence, and strengthen operational resilience across mission-critical systems
- Drive intelligent capacity forecasting and performance optimization using AI models, ensuring infrastructure scales efficiently with demand while maintaining cost discipline, system reliability, and alignment with business growth objectives
- Design and implement AI-driven operational capabilities (AIOps) including predictive monitoring, anomaly detection, and automated remediation, transforming traditional operations into intelligent, self-healing systems that improve speed, accuracy, and scalability of infrastructure management
- Build AI-ready infrastructure platforms that support advanced analytics, automation pipelines, and enterprise AI workloads, ensuring secure, scalable environments that enable innovation while maintaining strict compliance with regulatory standards
- Leverage AI and data-driven insights to inform infrastructure strategy, optimize performance, and enhance operational decision-making, enabling faster, more accurate responses to evolving system demands and business priorities
Requirements:
- Bachelor's degree in Information Systems, Computer Science, Engineering, or related field required
- 8–10 years of hands-on experience in infrastructure engineering, DevOps, or Site Reliability Engineering roles
- Proven experience designing, implementing, and operating hybrid infrastructure solutions across on-premises and cloud environments
- Hands-on experience designing and operating solutions across both Microsoft Azure and AWS in production environments
- Strong hands-on expertise in Microsoft Active Directory (Hybrid AD), Microsoft Entra ID (Azure AD), and Group Policy
- Experience working in regulated environments with knowledge of HIPAA, SOC 2 Type II, and/or HITRUST compliance requirements
- Experience operating infrastructure remotely using secure access methods, including VDI (Virtual Desktop Infrastructure), bastion hosts, or privileged access workstations
- Strong problem-solving, documentation, and communication skills, with the ability to influence technical direction and drive engineering best practices across teams
Preferred Qualifications:
- Master's degree in Information Systems, Computer Science, Engineering, or related field
- Azure Solutions Architect Expert (AZ-305) or AWS Solutions Architect (Associate/Professional)
- Azure Administrator (AZ-104) or AWS SysOps Administrator
- Azure DevOps Engineer Expert (AZ-400) or AWS DevOps Engineer Professional
- HashiCorp Terraform Associate
- CISSP or CISM
- Microsoft Identity and Access Administrator (SC-300) or Azure Security Engineer (AZ-500)