Tenstorrent is leading the industry in cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. The company is seeking a Staff Software Engineer with a platform infrastructure or Site Reliability Engineering background to work on infrastructure automation, integration, and operations.
Responsibilities:
- Hands-on software engineering to push infrastructure and operational excellence further
- Effective collaboration with end-users, peers, domain experts, and stakeholders
- Leadership to grow teams’ capabilities and eagerness to learn more
Requirements:
- Fluent in Python, Infrastructure-as-Code (Ansible), shell scripting, Linux SysOps, and CI/CD
- DevOps mindset with experience in software integrations and operational infrastructure
- Experienced in observability, including hardware-, system-, and application-level telemetry, monitoring, and alerting (Prometheus, Loki, Alloy, Grafana, Sentry, SNMP, Redfish, IPMI)
- Familiarity with bare-metal, virtual machine, and Kubernetes provisioning and operations
- Neocloud / CSP background is a plus