AWS, Azure, Cloud, Linux, Amazon Web Services, New Relic, CI/CD
About this role
Role Overview
Analyze and optimize existing New Relic dashboards, telemetry, and monitoring setup for the TRAIT application.
Review and refine PagerDuty alert triggers, escalation policies, and incident workflows to ensure only actionable events generate alerts.
Identify obsolete dashboards, alerts, and monitoring components, and clean up or optimize them based on current operational requirements.
Support DevOps operational activities including monitoring production environments, incident response, root cause analysis, and reliability improvements.
Collaborate with development, infrastructure, and support teams to improve application observability and operational health.
Assist in automation and monitoring integration within CI/CD and cloud environments.
Recommend and implement observability best practices for logging, metrics, tracing, and alerting.
Requirements
Strong hands-on experience with New Relic including APM, telemetry, dashboards, alerting, and observability optimization.
Experience in configuring and managing PagerDuty alerts, on-call workflows, escalation policies, and incident management processes.
Solid experience in DevOps/SRE operations, including production monitoring, troubleshooting, and operational support.
Experience with cloud and DevOps tools such as Amazon Web Services or Microsoft Azure, CI/CD pipelines, Linux, scripting, and infrastructure monitoring.