System One is a leader in delivering outsourced services and workforce solutions across North America. They are seeking a Mid-Level DevOps Software Analyst responsible for mission-critical support of applications and infrastructure, ensuring proper monitoring, scale, and resiliency across environments from test to production.
Responsibilities:
- Design, build, and maintain Azure landing zones and platform services (e.g., VNet, Private Endpoints, Key Vault, Azure Firewall/NSGs, Application Gateway/WAF)
- Implement Infrastructure as Code (IaC) with Terraform and/or Bicep; enforce GitOps workflows (branching, PRs, policy checks)
- Create reusable modules, pipelines, and golden patterns for app teams; champion automation-first approaches
- Define and measure SLIs/SLOs, error budgets, and reliability roadmaps for critical services
- Implement and tune observability (logs, metrics, traces) using Azure Monitor, Log Analytics, Application Insights, and Prometheus/Grafana where applicable
- Conduct capacity planning, resiliency testing (chaos, failover, DR), and performance tuning across services
- Build secure, robust CI/CD pipelines (GitHub Actions / Azure DevOps Pipelines) with automated testing, scans, and approvals
- Standardize deployment strategies (blue/green, canary, rolling) for containerized and PaaS workloads
- Manage container platforms (AKS: node pools, cluster autoscaling, HPA/VPA, ingress, network policies) and registries (ACR)
- Implement guardrails using Azure Policy, RBAC, PIM, and Blueprints (or equivalent) to enforce least privilege and compliance (e.g., SOC 2, ISO 27001, HIPAA as relevant)
- Manage secrets and certificates (Key Vault) and integrate security testing (SAST/DAST/Container scanning) into pipelines
- Support vulnerability remediation and patching SLAs
- Own incident response, including rotational shifts and on-call; lead triage, root cause analysis (RCA), and post-incident reviews
- Optimize cost (FinOps), tagging standards, budgets, and proactive spending alerts
- Maintain runbooks, knowledge base articles, and automation for routine operations
- Act as a technical mentor; review designs/PRs; contribute to architecture decisions
- Partner with app teams to onboard workloads, define nonfunctional requirements, and drive platform adoption
- Manage and/or participate in the deployment and release of development, test, and production software builds
- Manage the operations and monitoring of applications and infrastructure from dev to production
- Develop code and escalate break/fix issues as they occur
- Automate infrastructure and application provisioning tasks
- Ensure all environments meet scale and resiliency requirements
- Support customer-facing and internal applications
- Monitor submitted tickets; assign, escalate and communicate, as required
- Participate in services and software systems design
- Participate in rotating on-call support duties
Requirements:
- 5 to 7 years of hands-on experience with Azure-based infrastructure and services in production
- Bachelor's degree in computer science (or related) or equivalent work experience required
- 2+ years of experience supporting enterprise-level applications and their infrastructure
- Strong understanding of web applications and their related infrastructure technologies (load balancers, DNS, IIS or Apache/WebSphere, authentication and authorization, database connections, etc.)
- Experience implementing or managing Application Monitoring, Splunk, or Dynatrace
- Working knowledge of automation technologies
- Familiarity with Agile/Scrum methodologies
- Deep expertise in several of: AKS, App Services, Functions, APIM, Azure SQL/MI, Cosmos DB, Storage, Event Hub/Service Bus, Redis, VNet/Peering, Private Link, Application Gateway/WAF, Front Door
- Strong IaC with Terraform (preferred) and/or Bicep; Git-based workflows; GitHub or Azure DevOps
- Proven SRE background: SLI/SLO design, error budgets, incident management, RCA, capacity and performance engineering
- CI/CD design and operations (GitHub Actions / Azure DevOps Pipelines); artifact/versioning strategies; release governance
- Observability with Azure: Monitor, Log Analytics, Application Insights, and alerting/automations (Action Groups, Logic Apps, Functions)
- Solid networking fundamentals (DNS, TLS, routing, firewalls, load balancing), identity (AAD/Entra ID), and secrets management (Key Vault)
- Scripting proficiency in PowerShell and/or Python; Linux fundamentals
- Understanding of security best practices: RBAC, PIM, Azure Policy, managed identities
- Good written, verbal, interpersonal, and presentation skills; ability to communicate with both technical and non-technical employees; process-oriented mindset
- Demonstrates a customer-driven approach and good relationship management skills
- Working knowledge of mainframe architecture is a plus
- Ability to work autonomously and under deadlines
- Ability to multi-task, be highly organized, and work independently
- Ability to identify areas of improvement and come up with creative solutions
- At least 2 years of coding/scripting experience in Java, JavaScript, and HTML