ClickHouse is a rapidly growing cloud company recognized on the 2025 Forbes Cloud 100 list. The company is seeking an experienced engineer to join its Observability team, which builds and operates a telemetry platform that keeps internal monitoring and customer observability features reliable and efficient.
Responsibilities:
- Design, build, and operate distributed systems that power observability across ClickHouse Cloud
- Own the reliability, performance, and cost-efficiency of the telemetry pipeline and storage systems
- Take part in the on-call rotation and help drive root-cause resolution and long-term fixes
- Build tooling and automation to eliminate repetitive operational work
- Help shape the roadmap for observability by identifying bottlenecks and scaling challenges
- Collaborate with other engineering teams to improve their observability posture
- Contribute to design discussions and architecture reviews, and mentor teammates
Requirements:
- 5+ years building and running production systems at scale
- Proficiency in Golang
- Experience with Kubernetes, Helm, ArgoCD, and Terraform or similar IaC tools
- Comfortable working with at least one major cloud provider (AWS, GCP, or Azure)
- Experience with OpenTelemetry, Prometheus, Grafana, or similar tools
- Experience with ClickHouse