Mirantis is the Kubernetes-native AI infrastructure company, enabling organizations to build and operate scalable, secure, and sovereign infrastructure for modern AI, machine learning, and data-intensive applications. The Technical Support Engineer is responsible for remotely maintaining the reliability and performance of customer environments and for supporting any Mirantis OpenStack/k0s layers customers run alongside them.
Responsibilities:
- Proactively and reactively troubleshoot customer environments based on Linux, OpenStack, Kubernetes, networking, and other cloud technologies: detect, report, and resolve issues, and manage alerts
- Learn and troubleshoot MOSK (Mirantis OpenStack for Kubernetes)
- Triage, handle, and resolve IaaS-related needs in customer data centers
- Reproduce customer issues in labs where needed, confirm bug reports, provide detailed information to the development team
- Work closely with the development team: discuss customer issues, suggest improvements, fix product bugs, etc.
- Own escalations end-to-end by routing issues to the appropriate teams (including OpenStack/Ceph storage, networking, hardware, infrastructure, and product engineering) while maintaining accountability and follow-through
- Participate in the weekend on-call rotation and holiday coverage
- Communicate urgently, clearly, and in detail with customers during incidents via email and remote session, providing accurate status updates and guiding them through troubleshooting and resolution
- Use AI tools to increase efficiency in problem-solving tasks for customers
- Troubleshoot server issues: DCOps issues, firmware upgrades, ISO, OS, Dell, Lenovo, ST Micro
- Troubleshoot data center networking requests and upgrades, triaging where applicable
Requirements:
- High school diploma or equivalent required; four-year college degree or equivalent work history preferred (3+ years of technical customer support in IaaS or SaaS technologies)
- Knowledge of OpenStack, Neutron, Kubernetes, and object storage principles
- Strong English speaking and writing ability required
- Expert Linux system administration and troubleshooting skills
- DCOps experience: firmware upgrades, ISO, OS, Dell, Lenovo, ST Micro
- Networking: tools (NetBox, LibreNMS, Verity, Tailscale) and experience troubleshooting network issues (VLANs, ports, spine/leaf configuration, change management, Fortinet, Cisco, Juniper)
- Python, scripting experience, automation concepts
- Understanding of networking concepts and protocols
- Alert management experience
- Standard Operating Procedure (SOP) generation
- Change Management experience
- Good knowledge of virtualization solutions (libvirt, KVM, VMware)
- Good knowledge of network and distributed storage solutions
- Experience with databases and message brokers (MySQL, Galera, PostgreSQL, RabbitMQ, InfluxDB, Elasticsearch, Cassandra, ZooKeeper)
- Experience with configuring, customizing, and extending monitoring tools (Grafana, Kibana, Nagios, Prometheus)
- Experience working with configuration management tools (Puppet, Chef, Salt, Ansible, Helm)
- Kubernetes experience, OpenStack frameworks
- ToR (top-of-rack) switching, networking concepts