Shyft6 is seeking a Senior Production Support Engineer to support and maintain AI-driven applications, data platforms, and client-facing solutions in production. The role is responsible for system stability, performance, and reliability across cloud and data ecosystems.
Responsibilities:
- Provide L2/L3 production support for applications, data pipelines, and AI-driven solutions
- Monitor system performance and respond to incidents, alerts, and service disruptions
- Perform root cause analysis (RCA) and either implement fixes or coordinate them with engineering teams
- Support data pipelines (ETL/ELT) and ensure the accuracy of data feeding into reporting tools (Tableau, Power BI)
- Troubleshoot and resolve issues related to API integrations and microservices
- Support CRM integrations (DealCloud) and related data workflows
- Maintain and improve monitoring, logging, and alerting systems
- Execute runbooks and standard operating procedures (SOPs) for issue resolution
- Collaborate with development, QA, and data teams to ensure smooth deployments and production readiness
- Participate in on-call rotations and provide after-hours support as needed
- Identify opportunities for automation and process improvement within support operations
Requirements:
- 5+ years of experience in Production Support, Application Support, or Site Reliability Engineering (SRE)
- Strong experience supporting systems in AWS and/or Azure environments
- Experience troubleshooting data pipelines, ETL/ELT processes, and data-related issues
- Strong SQL skills for data investigation and validation
- Experience with monitoring and observability tools (e.g., Datadog, Splunk, New Relic, CloudWatch, Azure Monitor)
- Experience with API troubleshooting and microservices-based architectures
- Familiarity with incident management and ticketing systems (e.g., ServiceNow, Jira)
- Basic scripting or programming experience (e.g., Python, Bash, or PowerShell)