Astronomer is a company empowering data teams to bring software, analytics, and AI to life through their unified DataOps platform, Astro. They are seeking a Senior Software Engineer to join their Platform Engineering team, focusing on designing, building, testing, and deploying production infrastructure that enhances data pipeline orchestration for global organizations.
Responsibilities:
- Make high-quality, data-driven and experience-driven decisions on how we build the current and next generation of our production platform, then deliver the results
- Own and shape how we test, build and deploy code in a high-scale PaaS environment
- Collaborate across the whole company on how we design production systems, set standards and make technology choices for new and existing products, and how these fit together
- Deliver results - we routinely “change the wheels on the bus while it’s moving”, in a predictable, safe and reliable way
- Be at the forefront of how we work together as a Platform Engineering team
- Blaze a Trail: Work on a small but growing team to build out the Platform/Reliability practice for the company – this role reports directly to the VP of Reliability
- Be an Owner: Be directly involved in decision-making on what we work on, as well as how we work on it. Make promises, and keep them
- Do Sensible Things: Be directly involved in determining how our platform works. Participate in incident management and determine sensible practices as the platform evolves
- Garage Door Open: Create and maintain comprehensive internal documentation for systems and processes, ensuring clarity and accessibility
Requirements:
- Strong experience in Non-Abstract Systems design and implementation
- Strong proficiency in Python and Golang, and in-depth experience with Kubernetes (CKA or equivalent, or greater)
- Experience with observability principles and technologies, including SLI/SLO definition and tracking
- Strong communication skills, both written and verbal, with experience delivering work as part of a globally distributed team
- A passion for reliability and operational excellence. A low tolerance for toil and other nonsense
- Ability to estimate the scope of work accurately and coordinate with stakeholders to address risks and ensure successful project delivery
- Experience with (and ideally strong opinions on) software development best practices, such as code review, testing, CI/CD, version control, automation and debugging
- Proactive approach to identifying and addressing issues, with a focus on ownership and accountability
- Experience working on a SaaS/PaaS product across multiple cloud providers
- Experience with our particular tech stack components and technologies (deep breath): CircleCI, Chronosphere (Prometheus), Splunk, Bazel, Istio, Playwright, Karpenter, GitHub Actions …
- Experience with the innards and quirks of AWS, GCP and (particularly) Azure
- Experience participating in an on-call rotation - this role involves periodic on-call for the services we own
- Experience with Apache Airflow