Senior Site Reliability Engineer

Jobgether

Remote

GBP 82,000 - 129,000

Full time

Today

Job summary

A leading tech recruitment service is seeking a highly skilled Senior Site Reliability Engineer in the UK. The role involves managing large-scale data systems, optimizing performance, and driving automation in a dynamic, remote-first environment. Candidates should have experience with cloud-native systems, Kubernetes, and distributed platforms. This position offers a competitive salary, flexibility, and opportunities for global collaboration, creating a significant impact in the field of data infrastructure.

Benefits

Competitive salary

Fully remote work

Paid time off

Health coverage

Professional development stipends

Qualifications

5+ years of experience managing production systems at scale.
Strong understanding of distributed systems and data platforms.
Ability to work independently in a remote-first team.

Responsibilities

Operate and maintain large-scale data systems ensuring stability.
Design and optimize deployment processes with containerization.
Collaborate with engineering teams to support projects.

Skills

5+ years of experience in SRE, DevOps, operations, or software engineering

Proficiency with scripting (Python, Go, Ruby)

Experience with configuration management tools (Puppet, Ansible, Terraform)

Strong understanding of distributed systems

Excellent English communication skills

Customer-oriented mindset

Experience with Kubernetes

Overview

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer in the UK.

We are seeking a highly skilled Senior Site Reliability Engineer to help design, operate, and scale data infrastructure and distributed systems. In this role, you will manage critical systems supporting large-scale data workloads, optimize performance, and drive automation across operations. You will collaborate with cross-functional teams, mentor peers, and proactively identify and resolve reliability challenges. The position offers opportunities to shape operational practices, improve service stability, and ensure seamless performance for users globally. Ideal candidates are experienced in cloud-native systems, Kubernetes, and large-scale data platforms, with a passion for improving reliability and operational efficiency in a dynamic, remote-first environment.

Accountabilities

Operate and maintain large-scale data systems and services, ensuring stability, scalability, and performance.
Design, implement, and optimize deployment processes, leveraging virtualization and containerization.
Monitor system health, analyze failures, and proactively identify sources of instability in complex distributed systems.
Automate operational tasks, streamline workflows, and identify gaps in processes to improve efficiency.
Collaborate with engineering and data teams to support their projects, remove roadblocks, and improve productivity.
Mentor and provide guidance to peers in technical and operational best practices.
Collaborate asynchronously with a globally distributed team, and attend team gatherings or conferences as needed.

Requirements

5+ years of experience in SRE, DevOps, operations, or software engineering roles managing production systems at scale.
Proficiency with scripting and programming languages commonly used in SRE contexts (Python, Go, Ruby, etc.).
Experience with configuration management and orchestration tools such as Puppet, Ansible, or Terraform.
Strong understanding of distributed systems, data platforms, and virtualization of data and compute.
Ability to work independently and effectively as part of a globally distributed, remote-first team.
Excellent English communication skills, both written and verbal.
Customer-oriented mindset, with a focus on supporting users and communities.
Bonus: Experience with Kubernetes, Ceph, and operating large-scale data platforms.

Benefits

Competitive U.S.-based salary range: $113,082 – $175,725 per year, adjusted for skills, experience, and location.
Fully remote work with flexibility and autonomy.
Paid time off and sabbatical opportunities.
Health coverage and reimbursement options.
Professional development and home office stipends.
Opportunities for global collaboration and travel to team events.
A chance to contribute to large-scale, high-impact data infrastructure and distributed systems.

Why Apply Through Jobgether?

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Data Privacy Notice

By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
