Airbnb, founded in 2007, has grown to over 5 million hosts and 2 billion guest arrivals worldwide. As a Senior Staff Operations Engineer in AIOps, you will lead a high-performing team to implement AI-enabled operations solutions, streamline workflows, and ensure operational excellence across BizTech's corporate environment.
Responsibilities:
- Lead and mentor a high-performing team to scale an AI-enabled operations model and deliver AIOps solutions
- Own triage, resolution, and proactive monitoring across networks, systems, applications, and cloud services via a homegrown observability platform
- Drive process excellence through automation and shift-left programs
- Set the technical bar, model operational excellence, and ensure high-quality, reliable service
- Lead projects across multiple products and platforms, delivering world-class outcomes that create customer and community value while balancing near- and long-term needs
- Own the AIOps vision, strategy, and roadmap, partnering with the in-house Observability platform owner to leverage infrastructure and data for accurate, correlated insights that streamline Operations
- Drive execution by setting priorities, building accountability, and collaborating effectively across teams
- Partner with BizTech engineering teams to improve service efficiency and security
- Lead 1–3 year operations architecture planning to connect production systems and improve compatibility and stability
- Identify and eliminate recurring issues through scalable automation, improving operational performance and productivity
- Lead the development and maintenance of testing and monitoring tooling to ensure automation platforms run reliably
- Be accountable for the quality and reliability of BizTech services, including validating postmortems, driving root-cause analysis, and ensuring corrective actions are implemented
- Lead technical strategy and discussions, partnering with Operations peers and cross-functional BizTech teams to build AIOps and automation solutions
- Stay on top of tasks, engagements, and team interactions—active collaboration is key to success
- Work in sprints, delivering project work across coding, testing, design, documentation, and operational readiness reviews
- Dedicate part of each day to core Operations work: triaging tickets, spotting patterns, and driving scalable fixes that improve efficiency
- Participate in an on-call rotation, leading high-severity incident response as both incident commander and operations engineer
Requirements:
- 15+ years of experience across AIOps, data catalog architecture, product development, and/or Technical Operations infrastructure
- Strong SDLC experience, including infrastructure as code, configuration management, distributed version control, and CI/CD
- Deep expertise in complex enterprise infrastructure, especially cloud (AWS and/or Google), with a focus on AI/automation, data catalog architecture, workflows, and correlation
- Solid understanding of corporate infrastructure and applications, with the ability to translate that knowledge into AIOps requirements and integrations
- Proven ability to lead cross-team, cross-org delivery of large-scale, technically complex, ambiguous initiatives that anticipate business needs
- Proficient in Python or Go
- Experience building API integrations and event-driven architectures (e.g., AWS Lambda/SQS)