TixTrack creates superior ticketing solutions for performing arts and cultural institutions. The Platform Engineer will lead the design, scaling, and operation of the company’s platform infrastructure and core services, ensuring high availability and reliability in a multi-tenant SaaS environment.
Responsibilities:
- Lead the design, implementation, and evolution of reliable, scalable platform infrastructure supporting the company’s SaaS products and APIs
- Partner with Engineering teams to evolve platform architecture over time, supporting safe system decomposition, clear service ownership, and scalable service boundaries
- Architect and maintain infrastructure-as-code frameworks and automation standards to ensure consistency and repeatability across environments
- Establish and refine monitoring, alerting, logging, and observability practices to proactively detect and resolve system issues
- Own and improve incident response frameworks, root cause analysis processes, and post-incident reviews, driving systemic improvements
- Define and guide service level objectives (SLOs), service level indicators (SLIs), and error budgets to support reliability-driven decision-making
- Partner with Engineering teams to embed reliability, operability, and scalability considerations into system architecture and deployment workflows
- Lead improvements to CI/CD pipelines, deployment automation, and platform tooling to increase developer velocity and reduce operational friction
- Drive capacity planning, performance optimization, and scaling strategies to ensure platform readiness for growth and peak usage
- Collaborate with Security and Engineering leadership to ensure platform resilience, data protection, and compliance best practices are embedded in infrastructure design
- Provide technical leadership and mentorship to engineers across teams, influencing platform standards and architectural best practices
- Contribute to documentation, operational standards, and runbooks to ensure consistency and knowledge sharing
- Participate in on-call rotations and provide production support as needed to maintain system uptime and reliability
Requirements:
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience
- 8+ years of experience in Site Reliability Engineering, Platform Engineering, DevOps, Infrastructure Engineering, or related roles within a SaaS environment
- Demonstrated experience designing and supporting highly available, production-grade cloud infrastructure
- Deep experience with cloud platforms (Azure preferred), networking, and distributed systems architecture
- Strong experience with infrastructure as code, automation frameworks, and CI/CD pipeline design
- Experience supporting multi-tenant SaaS systems with stringent availability and performance requirements
- Experience with containerization and orchestration technologies (e.g., Kubernetes)
- Experience leading complex technical initiatives or serving as a senior technical advisor across engineering teams
- Strong systems-thinking and architectural judgment, with the ability to design scalable, resilient solutions
- Advanced troubleshooting and root cause analysis skills across distributed systems
- Ability to influence technical direction without direct authority
- Strong written and verbal communication skills, including the ability to document architecture decisions and operational processes
- Ability to balance long-term platform strategy with short-term business priorities
- High level of ownership, accountability, and attention to detail
- Ability to work independently while collaborating effectively across Engineering, Product, and Cybersecurity teams
- Commitment to operational excellence and continuous improvement