Palo Alto Networks is dedicated to protecting the digital way of life through innovative technology. They are seeking a Principal Engineer Software for their Chronosphere team to build developer tooling that enhances developer velocity and reliability across the engineering organization.
Responsibilities:
- Architect & Build: Design and maintain high-scale developer tooling and backend services that improve productivity and reliability across a distributed cloud environment. We operate in a 100% modern, cloud-native ecosystem, and you will work exclusively with ephemeral infrastructure and containerized microservices
- Infrastructure as Code (IaC): Treat infrastructure as a first-class citizen. You will define, deploy, and manage entire environments using declarative IaC (Terraform), ensuring our platform is reproducible and version-controlled
- Drive Systemic Quality: Identify and eliminate systemic bottlenecks in the software development lifecycle (SDLC) through architectural changes or advanced tooling
- Scale & Reliability: Ensure our infrastructure remains resilient under massive traffic loads while optimizing for performance, cost-efficiency, and near-real-time telemetry processing
- Strategic Leadership & Mentorship: Define platform standards and reference architectures that span a 1–3 year horizon, balancing feature velocity against long-term technical debt. Act as the 'glue' across teams, consulting on infrastructure best practices and up-leveling the organization through mentorship
Requirements:
- 8+ years of relevant experience with the following:
- Strong proficiency in at least one backend language (e.g., Go, Java, Python, or Rust). We prioritize the ability to write modular, testable code over knowledge of a specific language syntax
- Deep knowledge of Linux internals, process management, and resource isolation
- Understanding of the OSI model, service meshes, load balancing, and 'zero-trust' security architectures
- Experience building and debugging systems that deal with CAP theorem trade-offs, eventual consistency, and distributed tracing
- A track record of completing assigned tasks/tickets reliably and estimating work effectively within a sprint. You take ownership of features from local development through to basic testing and delivery
- The ability to debug your own code efficiently using logs and tests, while proactively identifying edge cases (such as null values or boundary limits) during the design phase
- Strong communication skills to keep teammates informed, raise blockers early, and contribute meaningfully to code reviews and design discussions
- A proactive approach to learning new tools, processes, and libraries. You are open to feedback and use incidents or code reviews as opportunities to up-level your skills
- Experience or interest in using AI coding assistants (like Cursor or Claude) to improve productivity and automate boilerplate tasks