W2 Role
Title: Site Reliability Engineer or DevOps Engineer
Location: Charlotte (Onsite)
Duration: Long-term
Job Type: Contract
JOB DESCRIPTION:
Core SRE Skills
- 5+ years of experience in Site Reliability Engineering or DevOps roles
- Strong understanding of SRE principles (SLIs, SLOs, error budgets, toil reduction)
- Experience with incident management and post-mortem processes
- Proven ability to design for reliability, scalability, and performance
Observability & Monitoring
- Implement and maintain comprehensive monitoring using Dynatrace, CloudWatch, LogRocket, and X-Ray
- Leverage AI/ML capabilities in observability tools to detect anomalies and predict potential issues
- Set up intelligent alerts and dashboards for critical functionality in the Deposits application
- Create Power BI reports for operational metrics, SLI/SLO tracking, and executive visibility
Incident Management
- Assist in triage during production incidents and outages
- Respond to incidents and participate in post-mortem analysis
Automation & Infrastructure
- Automate platform and infrastructure provisioning using Terraform
- Automate operational tasks to reduce manual toil
FinOps & Cost Optimization
- Manage AWS cost optimization for the Deposits application stack
- Implement cost monitoring and alerting mechanisms
Development Skills
- Strong programming skills in Java or Node.js
- REST API and/or SOAP service development experience
- Understanding of microservices and serverless architecture patterns