Bright Vision Technologies is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations. The company is seeking a skilled Senior DevOps Engineer to architect, build, automate, and maintain enterprise-grade cloud infrastructure and CI/CD platforms that support mission-critical applications.
Responsibilities:
- Design, build, and continuously refine scalable cloud infrastructure using AWS, Azure, or Google Cloud Platform, ensuring environments are highly available, fault tolerant, secure, cost-optimized, and aligned with the organization's cloud architecture strategy and operational standards
- Author secure, reusable, and well-documented Infrastructure as Code (IaC) using Terraform, AWS CloudFormation, or Azure Resource Manager templates, following infrastructure best practices, governance policies, and security standards while enabling consistent environment provisioning across multiple stages
- Develop, implement, and maintain enterprise CI/CD pipelines using Jenkins, GitHub Actions, GitLab CI, Azure DevOps, or equivalent platforms to automate code integration, testing, security scanning, deployment, rollback, and release management processes
- Design and manage containerized application platforms using Docker and Kubernetes, implementing scalable orchestration strategies, workload scheduling, service discovery, ingress management, autoscaling, and high availability across production environments
- Actively participate in infrastructure architecture discussions, cloud migration initiatives, disaster recovery planning, and technology evaluations by providing technical recommendations that balance scalability, resiliency, maintainability, security, and operational efficiency
- Continuously monitor, analyze, and optimize infrastructure performance, cloud resource utilization, application availability, deployment frequency, system reliability, and operational costs using proactive alerting, logging, and performance tuning techniques
- Implement and maintain centralized monitoring, logging, observability, and incident management solutions using tools such as Prometheus, Grafana, ELK Stack, CloudWatch, Datadog, Splunk, or New Relic to improve visibility and accelerate issue resolution
- Develop comprehensive automation for infrastructure provisioning, configuration management, patch management, backup strategies, security compliance, vulnerability remediation, and operational tasks using Ansible, Chef, Puppet, Bash, Python, or PowerShell scripting
- Contribute meaningfully to DevSecOps initiatives by integrating automated security scanning, vulnerability assessments, compliance validation, secrets management, identity and access management, and policy enforcement throughout the CI/CD lifecycle
- Proactively identify infrastructure bottlenecks, operational risks, security vulnerabilities, technical debt, and automation opportunities by conducting root cause analysis, capacity planning, architecture reviews, and continuous improvement initiatives
- Collaborate effectively within Agile/Scrum delivery teams by participating in sprint planning, backlog refinement, daily standups, release planning, production deployments, incident response activities, retrospectives, and cross-functional technical discussions to ensure reliable software delivery
- Maintain comprehensive infrastructure documentation including cloud architecture diagrams, deployment procedures, CI/CD workflows, disaster recovery plans, operational runbooks, standard operating procedures, and knowledge base articles to ensure maintainability and operational continuity
Requirements:
- Bachelor's degree in Computer Science, Information Technology, Engineering, or a closely related technical discipline
- Five or more years of professional experience implementing and supporting enterprise DevOps practices, cloud infrastructure, automation, and CI/CD solutions in production environments
- Strong, demonstrable understanding of cloud computing principles, infrastructure architecture, networking, operating systems, virtualization, automation, high availability, disaster recovery, and system security best practices
- Advanced working knowledge of AWS, Azure, or Google Cloud Platform, including compute, networking, storage, identity management, monitoring, and security services used in enterprise cloud environments
- Hands-on production experience designing and managing CI/CD pipelines using Jenkins, GitHub Actions, GitLab CI, Azure DevOps, or similar automation platforms supporting enterprise software delivery
- Proven experience implementing Infrastructure as Code using Terraform, AWS CloudFormation, Azure Resource Manager, or equivalent automation frameworks for infrastructure provisioning and lifecycle management
- Strong knowledge of Linux system administration, containerization and orchestration with Docker and Kubernetes, networking concepts, scripting languages (Bash, Python, PowerShell), and configuration management tools such as Ansible, Chef, or Puppet
- Solid experience with Git-based version control, branching strategies, code review processes, release management, DevOps workflows, and Agile software development methodologies
- Hands-on experience implementing monitoring, logging, observability, security, backup, disaster recovery, and cloud governance solutions across enterprise production environments
- Strong troubleshooting, root cause analysis, communication, documentation, and problem-solving skills, with the ability to diagnose complex infrastructure issues, automate repetitive processes, and support mission-critical systems under tight operational deadlines
- Experience designing and supporting event-driven architectures, serverless computing, Kubernetes operators, service mesh technologies, and cloud-native application platforms
- Familiarity with DevSecOps practices, security automation, vulnerability management, policy-as-code, container security, secrets management, and compliance frameworks such as SOC 2, ISO 27001, PCI DSS, or HIPAA
- Exposure to distributed systems, site reliability engineering (SRE), chaos engineering, blue-green deployments, canary releases, infrastructure resilience, and performance engineering methodologies
- Experience implementing automated testing, release automation, cloud cost optimization, platform engineering, self-service infrastructure, and continuous improvement initiatives within enterprise Agile and DevOps environments