TEKsystems, a leading provider of business and technology services, is seeking a highly versatile DevOps Engineer to take charge of cloud engineering and reliability. The role involves making sound technical decisions, leading future team-building efforts, and driving scalability and cost optimization across diverse systems.

Responsibilities:

Core focus: DevOps / Cloud Engineering
Must be the sole DevOps engineer initially — someone confident making sound technical decisions for the business
Needs to be strong‑willed: able to push back when the business requests something that isn’t optimal, articulate risks, and propose the right solution
Expected to both engineer and lead (future team-building potential)
Communicate technical issues in business terms to non-technical stakeholders
Challenge developers constructively on reliability, scalability, and release risk
Partner with product teams to enforce reliability standards and guardrails

Requirements:

AWS (primary cloud)
Lambda + monitoring/observability
Python (building automation, AI-driven reports)
Linux background
CI/CD experience
Terraform (Infrastructure as Code)
Datadog (nice‑to‑have)
Strong IT fundamentals and a broad technical skillset spanning infrastructure, cloud, databases, and automation
Ability to operate in ambiguity
Ability to communicate effectively with both technical and non-technical stakeholders
Proactively drives reliability, scalability, and cost optimization across diverse systems
Site reliability, Azure, automation, cloud
Azure: Resource groups, networking (VNets, NSGs), AKS, App Services, Functions, Storage, Key Vault, Monitor, Policy, Cost Management
Systems & Networking: Linux/Windows internals, DNS, TLS, routing, load balancing, caching
Datastores: SQL Server, PostgreSQL/MySQL, Cosmos DB, Redis—query performance, indexing, backup/restore, HA/DR patterns
Observability: Metrics, logs, traces; SLOs/Error Budgets; using Azure Monitor/Log Analytics/Grafana/Prometheus
Automation: Infrastructure-as-Code (Bicep/Terraform), CI/CD (GitHub Actions/Azure DevOps), scripting (Python/PowerShell), runbooks
Security & Compliance: Secrets, identity (Azure AD), least-privilege, policy enforcement, vulnerability/TLS hygiene
Ticket triage & backlog hygiene: “Why is this open?”; age/priority/impact; close/noise reduction; define clear exit criteria
Incident management: Rapid diagnosis; comms to business; post-incident reviews that produce durable fixes (not blame)
Capacity & performance: Can we scale back safely? Where do we need headroom? Evidence-based decisions
Change management: Guardrails, pre-flight checks, safe deploys/rollbacks, feature flags
Expert Level
