Continuously improve the reliability and performance of ClickHouse core.
Create and improve metrics and alerts for ClickHouse to identify and prevent production problems before they affect customers.
Dig deeper into the most common problems customers encounter in ClickHouse core to identify root causes, submit bug fixes and issue reports, and suggest improvements.
Enhance and refine incident response processes and post-mortem analysis for ClickHouse core-related outages, including working with support and Cloud teams to communicate with impacted customers.
Plan, enable, and drive Chaos initiatives across Engineering teams, based upon internal priorities.
Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating escalation to resolve issues and minimize customer impact.
Requirements
Bachelor’s or Master’s degree in Computer Science or a related field.
At least 5 years of experience in Reliability Engineering, QA, or customer-facing engineering.
Previous experience operating ClickHouse or other SQL databases in production.
Excellent understanding of distributed database internals and SQL; knowledge of ClickHouse in particular is a major plus.
Scripting experience with Shell or Python, and ability to read and understand C++ code.
Knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform.
You are a strong problem-solver and have solid production debugging skills.
You thrive in a fast-paced environment as part of a global team, and you see yourself as a partner with the business with the shared goal of moving the business forward.
You have a high level of responsibility, ownership, and accountability.
Excellent communication skills.
Tech Stack
AWS
Azure
Cloud
Google Cloud Platform
Python
SQL
Benefits
Flexible work environment
ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries.
Healthcare
Employer contributions towards your healthcare.
Equity in the company
Every new team member who joins our company receives stock options.
Time off
Flexible time off in the US, generous entitlement in other countries.
A $500 home office setup if you’re a remote employee.
Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites.