Deploy software updates on a cloud platform comprising Linux OS, BIOS/firmware, OpenStack, Kubernetes, Calico, Ceph, MariaDB, and other software components
Ensure continual service availability, and troubleshoot and resolve production problems
Provide incident response, management and root cause analysis
Perform proactive development and implementation of monitoring systems
Maintain and continuously improve tools, ad-hoc scripting and automation infrastructure for configuration management, maintenance, testing, auditing, problem remediation and capacity planning
Collaborate with and manage vendors to drive defect resolution and enhancements for internal customers to meet departmental or enterprise business needs, in addition to providing extended support as needed
Conduct routine hardware and software audits of servers to ensure compliance with established standards, policies, and configuration guidelines
Perform the setup, maintenance, and monitoring of backups of supported infrastructure
Develop, promote, and curate standard operating procedures and methods of procedure in conjunction with risk assessments to reduce hazards to the network
Perform security compliance, auditing and remediation based on AT&T security policies in collaboration with CSO
Perform Operational Readiness Testing and Operational Acceptance Testing
Provide integration support and guidance to tenants
Support systems before launch in collaboration with TechArch, Labs, application teams, and vendors by providing system design and architecture guidance, system review, and testing to ensure that AT&T requirements and needs are met
Perform feasibility assessments, create requirements, manage projects, and integrate and test technical solutions for software
Requirements
Requires a Bachelor’s degree or foreign equivalent degree in Computer Science, Electronics, Information Technology, Engineering, or Communications
5 years of progressive, postbaccalaureate experience in the job offered
5 years of progressive, postbaccalaureate experience in a related occupation utilizing Linux, KVM, OpenStack, Kubernetes, containers, and cloud infrastructure platforms
Monitoring server infrastructure
Troubleshooting and systems administration of Linux, cloud, OpenStack, and Kubernetes environments
Operating and maintaining OpenStack modules such as Keystone, Nova, Neutron, Swift, Cinder, Heat, Glance, Horizon, and Fuel
Implementing and maintaining High Availability, DRS, Fault Tolerance, Scalability, and Reliability
Utilizing CI/CD pipelines in Jenkins and GitHub
Creating, reviewing, approving, and implementing changes in the NC server environment
Utilizing Python and shell scripting
Repairing Dell and HP x86-based hardware
Tech Stack
Cloud
Jenkins
Kubernetes
Linux
OpenStack
Python
Shell Scripting
Swift
Benefits
Medical/Dental/Vision coverage
401(k) plan
Tuition reimbursement program
Paid Time Off and Holidays (based on date of hire, at least 23 days of vacation each year and 9 company-designated holidays)
Paid Parental Leave
Paid Caregiver Leave
Additional sick leave beyond what state and local law require may be available but is unprotected