K1X, Inc. is a technology company transforming the K-1 industry into an all-digital experience. They are seeking a Platform & Infrastructure Engineer to build and maintain tools and standards for engineering teams, focusing on Azure infrastructure and Kubernetes.
Responsibilities:
- Design, build, and maintain Azure-based infrastructure, with a primary focus on Azure Kubernetes Service (AKS) for reliability, scalability, and developer experience
- Architect and operate infrastructure to support continuous availability — including zero-downtime deployments, automated rollouts, and the ability to scale capacity up and down in response to predictable demand peaks and quiet periods throughout the year
- Own system reliability and maintenance practices, including patching, upgrades, and configuration management across environments, ensuring infrastructure remains healthy, current, and audit-ready
- Develop and maintain disaster recovery and business continuity plans — including documented runbooks, tested recovery procedures, rollback strategies, and data recovery protocols that can be executed confidently when needed
- Develop and document reusable tools, networking patterns, and infrastructure templates for engineering teams to follow
- Collaborate cross-functionally with engineering teams to understand their needs and to prepare them for upcoming infrastructure changes
- Own and improve CI/CD pipelines using GitHub Actions, ensuring fast, reliable, and secure delivery workflows
- Manage infrastructure-as-code using Terraform, enabling repeatable and auditable provisioning across environments
- Implement and maintain observability and monitoring solutions, including Grafana dashboards and alerting, to provide teams with clear visibility into system health
- Manage identity and access using Microsoft Entra ID, applying least-privilege principles across services and teams
- Approach all infrastructure work with a security-first mindset — proactively identifying risks, enforcing compliance patterns, and communicating deviations from standard operating procedures
- Communicate clearly with stakeholders and adjacent teams on infrastructure changes, timelines, and dependencies
- Contribute to the team's knowledge base by creating runbooks, architecture documentation, and onboarding guides
Requirements:
- 4+ years of hands-on experience with Azure infrastructure in a production environment
- Deep experience with Azure Kubernetes Service (AKS), including cluster management, networking, scaling, GitOps, and day-2 operations
- Strong understanding of cloud networking, including VNets, NSGs, private endpoints, DNS, and ingress/egress patterns
- Experience with infrastructure-as-code — Terraform preferred
- Proficiency with CI/CD tooling, particularly GitHub Actions
- Comfort working in a small, remote team with a high degree of autonomy and ownership
- Strong written and verbal communication skills — able to work cross-functionally, explain technical decisions clearly, and keep stakeholders informed
- Security-conscious approach to infrastructure design and operations
- Eastern or Central time zone required for team collaboration
- AKS operations with GitOps, including experience using ArgoCD
- Grafana: dashboard creation, alerting, and Azure resource monitoring
- GitHub Actions with Blacksmith and Tailscale for secure and performant CI/CD workflows
- Microsoft Entra ID — app registrations, managed identities, conditional access, and RBAC
- Terraform — modules, remote state management, and environment-specific configurations
- Experience building internal developer platforms or platform engineering tooling
- Ability to identify and codify patterns that reduce toil for the broader engineering organization