Senior Site Reliability Engineer (SRE)

Leap29

Wokingham

On-site

GBP 100,000 - 125,000

Full time

17 days ago

Job summary

A leading company is seeking a Senior Site Reliability Engineer to enhance the stability and performance of critical platforms in Wokingham. This pivotal role involves leadership in ensuring system reliability, managing incidents, and mentoring engineering teams while employing modern software and automation practices.

Qualifications

  • 5+ years of experience in Site Reliability Engineering, DevOps, or Production Engineering.
  • Proven ability to lead incident response and design for resilience.
  • Strong communication skills across disciplines.

Responsibilities

  • Ensure availability and performance of mission-critical systems across cloud environments.
  • Lead post-incident reviews and coordinate root cause analysis.
  • Architect observability solutions to detect failures proactively.

Skills

Python
C#
Bash
PowerShell
Windows internals
Linux internals
Nginx
IIS

Tools

Azure DevOps
GitHub Actions
Jenkins
GitLab
Terraform
Bicep

Job description

Client: Leap29
Location: Wokingham, United Kingdom
Job Category: Other
EU work permit required: Yes
Job Reference: 08fd37c63578
Job Views: 4
Posted: 29.06.2025
Expiry Date: 13.08.2025

Job Description:

Location: Wokingham (2 days a week onsite)
Type: Inside IR35
Rate: £80.00 an hour DOE

We’re seeking a Senior Site Reliability Engineer to play a key role in the stability, scalability, and performance of critical platforms and applications. This is a leadership-level position suited to individuals who can move seamlessly between code, infrastructure, incident response, and mentoring engineering teams.

You’ll work across systems, tools, and teams to ensure platform reliability and enable continuous improvement in how software is built, released, and operated.

What You’ll Be Responsible For

As a Senior SRE, you’ll lead initiatives that:

  • Ensure availability, latency, and performance of mission-critical systems across cloud and hybrid environments.
  • Architect observability solutions (monitoring, logging, alerting) that detect and prevent failures before they impact users.
  • Own and improve incident response workflows, including runbooks, communications, and root cause analysis.
  • Define and enforce SLIs, SLOs, and error budgets to balance innovation with operational stability.
  • Mentor engineers and advise teams on best practices for scalability, security, deployment, and incident readiness.
  • Automate repetitive work via infrastructure-as-code, CI/CD pipelines, scripts, and custom tooling.
  • Support and lead platform engineering efforts, reliability reviews, and cross-functional reliability programs.

Core Responsibilities

Operations Leadership

  • Act as a senior escalation point for major incidents and production outages.
  • Lead post-incident reviews, coordinate root cause analysis, and drive remediation plans.
  • Communicate platform health, risk, and improvement plans with technical and non-technical stakeholders.
  • Design and build robust CI/CD workflows using tools such as Azure DevOps, GitHub Actions, Jenkins, or GitLab.
  • Lead the design and delivery of resilient, scalable infrastructure using IaC (Terraform, Bicep, etc.).
  • Develop automation and observability tooling that enables fast feedback loops and minimal manual intervention.

Strategic & Advisory

  • Define infrastructure architecture to support fault-tolerant applications.
  • Collaborate with developers, architects, and product teams to embed reliability into the software lifecycle.
  • Support implementation of secure, scalable deployment patterns (e.g., blue-green, canary releases, rollback strategies).
  • Influence reliability culture and DevOps maturity across teams.

Technical Environment

The ideal candidate brings hands-on experience in many of the following areas:

  • Languages & Scripting: Python, C#, Bash, PowerShell
  • OS & Systems: Windows and Linux internals, Nginx, IIS

Ideal Candidate Profile

  • Extensive experience (typically 5+ years) in Site Reliability Engineering, DevOps, or Production Engineering roles.
  • A solid software engineering background with the ability to read, write, and review production-quality code.
  • Proven ability to lead incident response, influence reliability culture, and design for resilience.
  • Experience operating complex systems in fast-paced, high-availability environments.
  • Strong collaborator who can work across development, infrastructure, and security disciplines.
  • Passion for solving operational problems through automation, not repetition.

What You’ll Bring

  • Ability to lead technical decisions while balancing risk and velocity.
  • Strong communication skills across technical and non-technical stakeholders.
  • A mindset of continuous improvement, ownership, and mentorship.
  • Commitment to eliminating toil, improving developer experience, and delivering reliable platforms at scale.

Ready to make reliability your legacy?
We’d love to hear from experienced SREs who can bring stability to change and clarity to complexity.
