SS&C Technologies is a leading financial services and healthcare technology company headquartered in Windsor, Connecticut. They are seeking a Site Reliability Engineer to join their Internal Platform Services team, responsible for ensuring the reliability, scalability, and performance of core services that support the engineering ecosystem.

Responsibilities:

Ensure reliability, scalability, and performance of services through SLIs/SLOs, capacity planning, and incident response
Drive automation of infrastructure operations to minimize toil
Develop and maintain monitoring, alerting, and observability systems that enable proactive issue detection
Partner with internal engineering teams to define service-level objectives, improve deployment workflows, and integrate infrastructure with development needs
Contribute to on-call rotations and incident management, helping ensure high availability of services
Drive post-incident reviews and blameless retrospectives to improve reliability
Stay current with emerging technologies and recommend improvements to existing systems and practices

Requirements:

3+ years of experience as an SRE, DevOps Engineer, or Infrastructure Engineer
Solid experience with Kubernetes administration and tooling (e.g., Helm, ArgoCD, Kustomize)
Strong expertise in cloud platforms (e.g., AWS, GCP, or Azure)
Experience managing databases in production environments (e.g., backups, replication, tuning)
Proficiency in programming or scripting (e.g., Go, Python, Bash)
Deep understanding of CI/CD pipelines and infrastructure automation
Familiarity with monitoring/observability tools (e.g., Prometheus, Grafana)
Strong communication skills and ability to collaborate with software engineering teams
Experience in multi-tenant infrastructure environments
Exposure to compliance and security best practices in infrastructure environments

Senior Site Reliability Engineer - Central Platforms