Sapphire Software Solutions Inc is seeking a highly skilled Site Reliability Engineer (SRE) with a strong focus on availability, reliability, and performance. The ideal candidate will monitor and enhance system reliability while effectively managing batch production incidents.
Responsibilities:
- Monitor batch flow to ensure system reliability and stability
- Handle batch production incidents and escalations promptly
- Create and support batch plans for both planned and unplanned outages
- Improve alert quality and reduce noise in monitoring systems
- Provide support for batch jobs in a 24x7 shift model
- Collaborate with onshore and offshore teams to ensure effective communication and coordination
- Participate in incident, problem, and change management processes
- Conduct root cause analysis (RCA) and post-incident reviews
- Support production release and change validation efforts
Requirements:
- 8-10 years of proven experience in SRE and production batch support, with a strong focus on availability, reliability, and performance
- Minimum of 5 years of experience with Unix commands and shell scripting
- At least 5 years of experience working with Informatica, including the ability to create mappings
- Minimum of 3 years of experience with Google Cloud Platform (GCP) with proficiency in BigQuery, Cloud Spanner, Airflow, and monitoring & logging tools
- Experience in query writing with MS SQL
- Knowledge of stored procedures and batch job support in PL/SQL
- Proficiency in query writing with Snowflake
- Experience with at least one scheduling tool (e.g., Control-M, Tidal)
- Familiarity with ITIL-based incident, problem, and change management processes in operations
- Ability to use automation and generative AI to minimize manual operational effort
- Ability to develop scripts and dashboards for monitoring, alert analysis, and reporting
- Strong communication skills to effectively collaborate with onshore and offshore teams
- Ability to thrive in fast-paced, high-pressure environments
- Willingness to work weekends and on-call shifts based on project needs