Monitor and Manage Kubernetes Clusters: Ensure the stability, health, and scalability of Kubernetes clusters, and deploy applications and services on them
Kubernetes Management: Deploy, monitor, and scale applications on Kubernetes clusters. Maintain Helm charts, manage services, and ensure resource allocation for optimal cluster performance
Containerization & Deployment: Design and maintain Docker-based microservices architecture, ensuring consistent and reproducible deployments across staging, QA, and production environments
Cloud Infrastructure Management: Work with leading Cloud Platforms (AWS, Azure and/or GCP) to set up, configure, and manage infrastructure resources using Infrastructure as Code (Terraform, CloudFormation, etc.)
Monitoring & Incident Response: Set up monitoring solutions, define alerts, and manage the incident response process for any issues related to Jenkins or Kubernetes clusters
Automate Infrastructure Processes: Build automation tools for scaling, monitoring, and maintaining infrastructure using modern tools like Terraform, Ansible, Linux, or equivalent
Collaborate Across Teams: Work closely with development, services, and operations teams to ensure seamless integration between application development, deployment, and infrastructure
Security & Compliance: Ensure all systems follow security best practices and comply with relevant regulations. This includes role-based access, encryption, and automated vulnerability scanning
Requirements
Active TOP SECRET clearance or higher is required
Bachelor’s degree (or equivalent) in computer science or related discipline
A minimum of two (2) years of experience working with on-premises and off-premises cloud environments
Experience with AWS and/or Azure
Hands-on experience with a range of open-source technologies, such as Linux, Docker, Kubernetes, Terraform, Helm, PostgreSQL, or similar
Ability to program (structured and OOP) using one or more high-level languages, such as Python, Java, C/C++, Ruby, or JavaScript
Experience with distributed storage technologies such as NFS, HDFS, Ceph, and Amazon S3, as well as dynamic resource management frameworks (Apache Mesos, Kubernetes, YARN)
Proactive approach to identifying problems, performance bottlenecks, and areas for improvement
Ability to lead and work independently in an Agile/Scrum environment
Real passion for developing team-oriented solutions to complex engineering problems
Ability to thrive in an autonomous, empowering and exciting environment
Strong verbal and written communication skills to collaborate cross-functionally and improve scalability
Ability to work on multiple concurrent projects is essential, as are strong self-motivation and the ability to work with minimal supervision
Must be a team-oriented, energetic, results- and delivery-oriented individual with a keen interest in quality and the ability to meet deadlines
Must have demonstrated experience serving in, or supporting, a federal agency that requires a Top Secret clearance
Tech Stack
Ansible
Apache
AWS
Azure
Cloud
Docker
Google Cloud Platform
HDFS
Java
JavaScript
Jenkins
Kubernetes
Linux
Microservices
NFS
Postgres
Python
Ruby
Terraform
Yarn
Benefits
Competitive compensation packages reflecting value