Job title: Site Reliability Engineer
Work Location with Zip code: Charlotte, NC 28288 (Onsite)
Minimum years of experience required: 7+ Years
Mandatory Skills: Splunk, Dynatrace, AppDynamics, ThousandEyes, etc.
Good to Have Skills: DevOps/Platform/SRE/Build & Release
Key Responsibilities:
- Proactively monitor system health and performance using monitoring and observability tools
- Manage and resolve highly critical incidents end to end with minimal downtime
- Collaborate with development and infrastructure teams to ensure reliability and scalability
- Contribute to continuous improvement of system reliability, monitoring coverage, and alerting accuracy
- Drive automation and efficiency in incident response and post-incident reviews
- Hands-on experience with scripting languages (shell scripting, Python, etc.)
- Support and guide the migration of legacy applications to cloud platforms
- Working knowledge of Grafana dashboards
Required Skills & Experience:
- Experience in Site Reliability Engineering, including production support roles
- Hands-on expertise in Splunk, Dynatrace, and AppDynamics
- Knowledge of the ThousandEyes monitoring tool is a plus
- Proven track record of handling critical production issues independently
- Strong understanding of cloud migration strategies, tools, and processes
- Ability to work effectively in high-pressure environments and cross-functional teams
- Excellent troubleshooting, communication, and analytical skills