BMC Software is an innovative company focused on delivering AI-native solutions for IT organizations. It is seeking a Lead DevOps Engineer to drive the design, automation, and operation of large-scale data platform services, with an emphasis on event streaming and search analytics.
Responsibilities:
- Architect, deploy, and operate highly available Kafka and OpenSearch clusters for large-scale streaming, search, and observability workloads
- Define patterns for scaling, data retention, replication, and disaster recovery across hybrid and distributed environments
- Optimize performance, reliability, and cost efficiency for multi-tenant data platforms
- Lead adoption of Terraform, Ansible, Jenkins, and GitOps workflows to automate provisioning, configuration, patching, and upgrades
- Implement automated schema management, connector deployments, and data ingestion pipelines for Kafka topics and OpenSearch indices
- Design CI/CD workflows that integrate with streaming and search services
- Deploy and manage Kafka, OpenSearch, and supporting services on Kubernetes using Helm charts and operators
- Ensure resilient orchestration of stateful workloads and efficient resource utilization
- Build observability frameworks for Kafka brokers, topics, consumers, and OpenSearch nodes, including metrics, logging, and alerting
- Partner with SRE and platform teams to meet 99.99% uptime SLAs and proactively mitigate issues
- Enforce RBAC, encryption, auditing, and compliance controls for sensitive data streams and search indices
- Automate vulnerability patching, certificate management, and security hardening across platforms
- Drive initiatives that reduce operational overhead, improve reliability, and lower costs
- Mentor engineers on best practices for distributed data systems and platform engineering
- Align DevOps strategies with business goals around performance, uptime, and cost
Requirements:
- Bachelor's degree (or equivalent experience) in Computer Science, Engineering, Information Systems, or related field
- 10+ years of DevOps, Platform, or SRE experience with distributed data services
- Hands-on experience with Terraform, Ansible, Kubernetes, and CI/CD automation
- Strong background in database DevOps, data operations, or cloud database automation
- Deep knowledge of Kafka, OpenSearch, and large-scale data platforms
- Advanced proficiency with IaC (Terraform, Ansible), GitOps, CI/CD tooling, and database automation
- Skilled in multi-cloud and on-premises integrations
- Proficiency with Python, Bash, and automation frameworks
- Strong cross-functional leadership with global teams
- Anticipates and resolves scaling, reliability, and cost challenges
- Experience supporting PostgreSQL, Oracle, SQL Server, and/or modern data platforms
- Skilled in building, optimizing, and automating data pipelines for large-scale operations
- Aligns DevOps automation with business vision and cost savings
- Works closely with DBAs, Data Engineers, and cross-functional stakeholders
- Clearly explains complex technical solutions to both technical and non-technical audiences
- Anticipates and mitigates risks to maintain database reliability and performance
- Stays ahead of evolving data technologies and prepares the team for adoption
- Provides technical mentorship in DevOps and database automation best practices
Certifications:
- Kafka certifications (Confluent or equivalent)
- Cloud: AWS/Azure/GCP/OCI DevOps or Database certifications
- IaC & Automation: HashiCorp Terraform Professional, Ansible Automation certifications
- Containerization: Docker Certified Associate, Certified Kubernetes Administrator (CKA)
- DevOps: DevOps Institute certifications (DevOps Foundation, DevSecOps Engineering)