Role: Deployment Engineer & Coordinator
Location: Texas (Remote)
Type: Contract
Engineering – Build & Run (70%)
- IaC with Terraform: Author, modularize, and version Terraform modules for Azure AI services (e.g., Azure OpenAI, Azure AI Search, Azure ML, Key Vault, Storage, Event Hub, Container Registry), networking (VNets, Subnets, NSGs, Private Endpoints), AKS clusters/namespaces, and enterprise controls (Policies, RBAC).
- Pipelines & GitOps: Create Azure DevOps YAML pipelines (and/or GitHub Actions if used) for plan/apply, linting, security scans, policy checks, environment promotions, and change approvals; enable declarative app configs for AKS (Helm/Manifests/Flux/Argo). (Aligned to our Azure DevOps + GitOps reference patterns.)
- Kubernetes Platform Enablement: Engineer shared AKS add-ons (Ingress, Cert Manager, External DNS, CSI Secrets, Dapr/Sidecars as applicable); implement namespace isolation, network policies, HPA, and autoscaling for AI services/agents. (Supports our AI asset deployment model via dockerization + automated Terraform provisioning.)
- Security & Entra ID Integration: Implement Entra ID app registrations, service principals/managed identities, workload identities for AKS, role assignments, and key rotation; apply Azure Policy, Defender for Cloud, and private-only data paths.
- Observability & Reliability: Wire up Application Insights/Log Analytics, platform-level SLOs, and alerts; integrate with model/endpoint monitoring for AI workloads (drift, schema/feature checks) where applicable.
- Platform Templates & Golden Paths: Maintain reusable “golden” Terraform stacks, pipeline templates, and AKS baselines so onboarding teams can self-serve via standardized deployment patterns.
Coordination – Kanban, Intake & Risk (30%)
- Kanban Flow: Own the Kanban board for platform onboarding (backlog hygiene, WIP limits, service classes), run standups/flow reviews, and make work visible for stakeholders in Engineering, Security, and App Teams.
- Onboarding Coordination: Facilitate intake for new apps, clarify prerequisites (networking, identity, data boundaries), and align sequencing of Terraform updates, pipeline changes, and security reviews across teams.
- Risk/Issue Tracking: Proactively identify blockers (policy violations, quota limits, identity gaps, secrets management) and drive mitigations; escalate where needed with crisp, written updates.
- Standards & Enablement: Publish short enablement guides, walkthroughs, and checklists so product teams can adopt platform controls quickly and consistently.
Minimum Qualifications
- 5–7 years of hands-on experience in Python, Terraform, Azure, and DevOps/Pipelines (Azure DevOps or GitHub Actions).
- Proven experience deploying Azure infrastructure with Terraform (modules, workspaces, state management, policy as code) and running AKS workloads in regulated or enterprise environments. Strong understanding of Entra ID (app registrations, SPNs/managed identities, RBAC, workload identity for AKS) and secrets management (Key Vault, CSI driver).
- Solid grasp of networking & security in Azure (Private Endpoints, Firewall, NSGs, routing, TLS/certs, image governance).
- Demonstrated ability to coordinate Kanban flow, manage intake, and drive cross-team alignment with clear communication.