Synopsys Inc, a leader in chip design and technology innovation, is seeking a Principal Site Reliability Engineer to enhance the reliability and performance of its engineering environment. The role involves applying SRE practices, collaborating across teams, and transforming processes into scalable solutions.

Responsibilities:

Applying SRE practices to identify, monitor, communicate, and resolve issues in the environment, while also collaborating with internal teams and customers on post-mortem analysis to deliver root cause insights
Following up on reported issues and identifying procedures to prevent similar occurrences
Reviewing current processes and transforming them into scalable solutions
Debugging OS and engineering issues within our provided Linux environment
Collaborating on internal projects across different time zones and teams
Following up with customers and handing over tasks and issues to team members to make efficient use of time zones

Requirements:

10+ years of experience with SRE processes and related skills
Ability to understand complex engineering implementations and their interdependencies for troubleshooting
Deep knowledge of Linux distributions (CentOS, Red Hat, Ubuntu, SUSE)
Deep knowledge of virtualization and containerization technologies
Extensive knowledge of storage solutions, including network storage and associated protocols
Good experience with network technologies
Good experience with load-sharing facilities such as LSF and Slurm, and with other workload scheduling technologies
Good interpersonal, communication and leadership skills
