N-iX is seeking a Lead DevOps Engineer to architect and manage the lifecycle of stateful workloads within an Azure-based Microservices Framework. The role focuses on ensuring high availability and performance of databases and persistent storage layers while moving towards a fully automated infrastructure.
Responsibilities:
- Automate the end-to-end lifecycle of StatefulSets: provisioning, seamless volume expansion, graceful termination, and automated re-attachment during node failures
- Implement advanced scheduling logic (Pod Topology Spread Constraints, Anti-affinity) to ensure stateful workloads survive zonal outages and maintenance windows
- Optimize Azure Disk (Premium/Ultra) and Azure NetApp Files integration via CSI drivers to minimize IOPS bottlenecks and latency
- Develop and test automated "Snapshot-to-Restore" pipelines that restore data volumes from their Actual State to the Goal State in minutes, not hours
- Utilize Terraform to provision the hardened Azure foundation (Disk Encryption Sets, Proximity Placement Groups, and Networking) required for high-performance stateful clusters
Requirements:
- Expert-level understanding of StatefulSet controllers, Persistent Volume Claims (PVCs), and the Container Storage Interface (CSI)
- Deep experience with Azure Kubernetes Service (AKS), specifically persistent storage integration and Azure-specific networking constraints
- Proficiency in Go or Python/Bash for writing custom controllers or maintenance hooks (preStop/postStart) that ensure data consistency during updates
- Proven track record of managing production databases or distributed systems (e.g., Postgres, ClickHouse, Elasticsearch) on Kubernetes