Investigate, diagnose, and resolve complex production incidents across distributed systems
Perform in-depth technical root cause analysis (RCA) for customer-reported issues
Develop, maintain, and improve internal tools and services using Python
Collaborate with support engineers to reproduce, analyze, and prioritize escalated issues
Work directly with the operations team to support production workloads running on Linux-based systems
Analyze application logs, system metrics, and network-level behavior to pinpoint issues
Implement code fixes, performance improvements, and reliability enhancements
Participate in an on-call rotation for critical issue escalation (if applicable)
Document findings, create knowledge base entries, and automate repetitive troubleshooting steps
Contribute to CI/CD workflows, monitoring improvements, and infrastructure quality initiatives
Learn and support the entire infrastructure, not just records management

3+ years of professional experience with Python development
Moderate programming experience required, preferably in Python, with the ability to write clean, maintainable code and develop internal tools for diagnostics and automation
Strong hands-on experience with Linux systems: shell scripting, system diagnostics, process management, file system debugging
Deep Unix troubleshooting expertise required, supporting multiple clients and modern deployments
Solid understanding of networking fundamentals (TCP/UDP, DNS, HTTP/S, routing, firewalls), packet capture, and troubleshooting tools (tcpdump, ss, traceroute)
Ability to understand networking, handle complex upgrades, and absorb application-level infrastructure knowledge
Experience investigating production issues in real-world environments: log analysis, observability tools, incident response processes
Strong technical troubleshooting skills, ideally with a background as a developer who moved into Unix/DevOps
Ability to read and interpret large codebases and technical logs to identify root causes
Familiarity with version control (Git) and DevOps pipelines
Comfortable with high-pressure situations, emergency calls, and weekend work

Python Support Engineer

Key skills