Johnson Technology Systems, Inc. (JTSi) is a federal government consulting firm providing technical services to various agencies. They are seeking a Senior SRE/DevOps Operations Engineer to ensure high uptime and quality of service, while managing cloud services and collaborating with global team members.
Responsibilities:
- Provision, monitor and operate cloud services in a globally distributed team
- Analyze and solve operational issues and respond to incidents
- Perform complex systems administration and database administration, and manage landscape maintenance, upgrades and hotfixes
- Maintain the integrity and security of servers and systems
- Develop and operate monitoring policies and standards
- Ensure proper resource allocation related to the use of computing resources across cloud environments
- Conduct incident root cause analysis and implement continuous improvements
- Partner with product development team to design and enhance service reliability
- Develop and implement testing strategies and document results
- Work in a diverse environment and cross-train with other global team members
- Support an on-call rotation schedule
- Work a flexible schedule, which may include weekend or after-hours work
Requirements:
- MUST be a US Citizen and ONLY hold US citizenship (No dual citizens)
- Expertise with Git
- Expertise with Concourse including setup, management and troubleshooting of new pipelines
- Expertise with Linux, specifically SUSE and Ubuntu
- Expertise with Kafka, ZooKeeper and big data technologies
- Expertise in developing automation for testing, deployment, scalability and management of cloud services
- Expertise with building, implementing, and/or supporting cloud monitoring tools
- Expert knowledge of Cloud Computing and Databases
- Expert understanding of web services, networking, virtualization, and internet protocols
- Ability to multitask and handle various projects, deadlines and changing priorities
- Excellent communication and prioritization skills
- Expertise with security fundamentals as they pertain to multitenant SaaS application systems
- Strong interpersonal, presentation and customer service skills
- Participation in an on-call rotation for handling P1 incidents is required
- Experience with observability tools such as Prometheus and Grafana
- 8+ years of experience
- Experience with AWS Route 53, EC2, S3, CloudWatch, DynamoDB, RDS, IAM, ACM, KMS, VPC
- Experience with Cloud Foundry based environments
- Experience with Jenkins and/or Chef automation and Terraform
- Expertise with Kubernetes, including troubleshooting, operation, management and configuration of complex Kubernetes services
- Exposure to and understanding of troubleshooting IP networks and application stacks
- BS/BA degree in Computer Science, Management Information Systems, or related IT discipline preferred
- ALLOWABLE SUBSTITUTION: An additional four (4) years of experience can be substituted for a BS or BA degree