SourceDirect Talent is partnering with a company in the SaaS space to find a Senior Site Reliability Engineer. In this role, you will be part of the IT Operations group responsible for maintaining all environments that support the SDLC for a high-performing platform.
Responsibilities:
- Build, maintain, and support all environments that host the platform
- Monitor environments for issues, including configuring alerting and implementing self-healing solutions
- Update and maintain documentation and architectural diagrams
- Work closely with developers and testers to troubleshoot application and platform issues
- Design and implement architectural solutions to support new features and resolve system challenges
- Mentor junior-level engineers
Requirements:
- Kubernetes and service mesh management
- Windows and Linux administration
- Experience with cloud service providers (GCP, Azure, AWS, Oracle Cloud)
- Basic administration of Oracle, SQL Server, or another relational database
- Scripting experience (PowerShell, Bash)
- CI/CD and configuration management tools (Jenkins, Helm, Ansible)
- .NET application management
- Creative and analytical problem-solving, with strong organization and attention to detail
- Experience with AI/ML tools for automation, observability, predictive maintenance, and incident management
- Strong verbal and written communication skills
- Strong leadership, project management, and interpersonal skills
- Natural curiosity and self-motivation to continuously learn and stay current with technical skills
- High energy and the ability to manage a mix of short-, medium-, and long-term priorities