Site Reliability Engineer - Core C++ Team

ClickHouse

Canada

Remote

CAD 90,000 - 130,000

Full time

3 days ago

Job summary

A leading database company, ClickHouse, is seeking a Site Reliability Engineer for its Core C++ Team. This role focuses on improving the reliability, availability, and performance of ClickHouse and requires significant expertise in SQL databases and engineering practices. Join a global team committed to delivering top-notch services and help optimize ClickHouse in a rapid-growth environment. Enjoy a flexible, remote-friendly workplace with excellent benefits, including healthcare, equity options, and a home office setup.

Benefits

Flexible work environment
Employer contributions towards healthcare
Equity in the company
Flexible time off
$500 Home office setup for remote employees
Global Gatherings

Qualifications

  • 5+ years of experience in Reliability Engineering, QA, or customer-facing engineering.
  • Previous experience with ClickHouse or SQL databases in production is a major plus.
  • Thrive in a fast-paced environment as a part of a global team.

Responsibilities

  • Continuously improve the reliability and performance of ClickHouse core.
  • Create metrics and alerts to prevent production problems.
  • Enhance incident response processes and manage on-call processes.

Skills

Problem-solving
Production debugging
Understanding of distributed database internals
Scripting in Shell or Python

Education

Bachelor’s or Master’s degree in Computer Science or related field

Tools

AWS
Azure
Google Cloud Platform

Job description

Site Reliability Engineer - Core C++ Team
About ClickHouse

Established in 2009, ClickHouse leads the industry with its open-source column-oriented database system, driven by the vision of becoming the fastest OLAP database globally. The company empowers users to generate real-time analytical reports through SQL queries, emphasizing speed in managing escalating data volumes. Enterprises globally, including Lyft, Sony, IBM, GitLab, Twilio, HubSpot, and many more, rely on ClickHouse Cloud. It is available through open-source or on AWS, GCP, Azure, and Alibaba.

Note: This position can be based remotely in any country ClickHouse has a hiring presence.

We are committed to providing our customers with reliable and secure services at ClickHouse. To continue this, we are building out our Site Reliability Engineering team in ClickHouse Core. As one of the first members of our Reliability Engineering Team at Core, you will be responsible for building and leading processes to ensure and improve the reliability, availability, scalability, and performance of ClickHouse. You will collaborate with different teams like Control Plane, Dataplane, Security, Support, and Operations and guide them to implement ClickHouse in the best way for our customers. You will also own engineering escalation management and response, investigations, post-mortem analysis (including running blameless postmortems), and continuous improvement of how ClickHouse is run and optimized in the cloud. This role is a unique opportunity to make a significant impact on our elastic, limitless-scale, high-performance ClickHouse in ClickHouse Cloud.

What will you do?

  • Continuously improve the reliability and performance of ClickHouse core.
  • Improve and create metrics and alerts for ClickHouse to be able to identify and prevent problems in production before they affect customers.
  • Dig deeper into the most common problems encountered by customers in ClickHouse Core to identify their root causes, submit bug fixes and issue reports, and suggest improvements.
  • Enhance and refine incident response processes and post-mortem analysis for ClickHouse core-related outages, including working with Support and Cloud teams to communicate with impacted customers.
  • Plan, enable, and drive Chaos initiatives across Engineering teams, based upon internal priorities.
  • Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating escalation to resolve issues and minimize customer impact.

About you:

  • Bachelor’s or Master’s degree in Computer Science or a related field.
  • At least 5 years of experience in Reliability Engineering, QA, or customer-facing engineering.
  • Previous experience operating ClickHouse or other SQL databases in production.
  • Excellent understanding of distributed database internals and SQL; ClickHouse knowledge in particular is a major plus.
  • Scripting experience with Shell or Python, and ability to read and understand C++ code.
  • Knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform.
  • You are a strong problem-solver and have solid production debugging skills.
  • You thrive in a fast-paced environment as part of a global team, and you see yourself as a partner with the business with the shared goal of moving the business forward.
  • You have a high level of responsibility, ownership, and accountability.

Compensation

For roles based in the United States, you can find above our typical starting salary ranges for this role, depending on your specific location.

The positioning of offers within a certain range depends on various factors, including: candidate experience, qualifications, skills, business requirements and geographical location.

  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries.
  • Healthcare - Employer contributions towards your healthcare.
  • Equity in the company - Every new team member who joins our company receives stock options.
  • Time off - Flexible time off in the US, generous entitlement in other countries.
  • A $500 Home office setup if you’re a remote employee.
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites.

Culture - We All Shape It

As part of our first 500 employees, you will be instrumental in shaping our culture.

Are you interested in finding out more about our culture? Learn more about our values here. Check out our blog posts or follow us on LinkedIn to find out more about what’s happening at ClickHouse.

Equal Opportunity & Privacy

ClickHouse provides equal employment opportunities to all employees and applicants and prohibits discrimination and harassment of any type based on factors such as race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.

Please see here for our Privacy Statement.
