Senior Site Reliability Engineer

Leap29

Wokingham

Hybrid

GBP 100,000 - 125,000

Full time

Today

Job summary

A technology services company in Wokingham is searching for a Senior Site Reliability Engineer (SRE) to lead initiatives in maintaining the performance and reliability of key platforms. The ideal candidate will have extensive experience with cloud environments, particularly OpenShift, and will play a pivotal role in automating processes and improving incident management. This position emphasizes collaboration across teams to ensure engineering stability and operational excellence. Competitive hourly rates apply depending on experience.

Qualifications

Hands-on experience with Azure, AWS, OpenShift, Kubernetes, and Docker.
Experience designing scalable infrastructure and automating processes.
Strong communication skills for cross-team collaboration.

Responsibilities

Ensure high availability and performance of critical systems across multiple platforms.
Design and implement robust observability systems for proactive issue resolution.
Mentor engineering teams on best practices for deployment and reliability.

Skills

Cloud & Containers

CI / CD & Automation

Observability

Languages & Scripting

Networking

Databases

OS & Systems

Incident Management

Collaboration

Experience

5+ years of experience in SRE, DevOps, or production engineering roles

Tools

Terraform

Azure DevOps

GitHub Actions

Jenkins

OpenShift

Senior Site Reliability Engineer (SRE)

Location: Wokingham (2 days/week onsite)

Type: Inside IR35

Rate: Up to £70.00 per hour (DOE)

We’re looking for a Senior Site Reliability Engineer (SRE) to lead efforts in maintaining the reliability, performance, and scalability of mission-critical platforms and services. This role is ideal for someone who thrives at the intersection of software engineering, infrastructure, automation, and incident response.

You’ll be instrumental in defining and implementing the standards and systems that keep applications running smoothly across cloud and hybrid environments, including OpenShift clusters.

What You’ll Be Responsible For

As a Senior SRE, you will:

Ensure high availability, strong performance, and low latency for critical systems across Azure, AWS, and OpenShift.
Design and implement robust observability systems (logging, monitoring, alerting) to detect and resolve issues proactively.
Lead and evolve incident management processes, including runbooks, communications, postmortems, and root cause analysis.
Define and monitor SLIs, SLOs, and error budgets to balance innovation with stability.
Automate manual processes through infrastructure-as-code, scripting, and modern CI/CD pipelines.
Mentor engineering teams on best practices for deployment, reliability, scalability, and incident preparedness.
Support and scale OpenShift-based containerized applications, including upgrade strategies, patching, and workload optimization.

Core Responsibilities

Operations & Incident Management

Act as the senior escalation point for outages and critical incidents.
Lead post-incident reviews and implement long-term remediation plans.
Communicate platform health and risk posture to stakeholders at all levels.

Engineering & Automation

Build and improve CI/CD pipelines using tools like Azure DevOps, GitHub Actions, Jenkins, and GitLab.
Design scalable, fault-tolerant infrastructure with IaC tools (Terraform, Bicep).
Create internal tools and automation to accelerate development and reduce operational toil.

Strategic & Advisory

Architect cloud and container infrastructure, with a focus on OpenShift, Kubernetes, and hybrid deployments.
Collaborate with engineering, architecture, and security teams to embed reliability into the SDLC.
Promote advanced deployment strategies (blue-green, canary, rolling updates) and rollback readiness.
Drive a culture of reliability, observability, and operational excellence across engineering teams.

Technical Environment

Hands-on experience with many of the following is expected:

Cloud & Containers: Azure, AWS, OpenShift, Kubernetes, Docker, App Services, IaaS (EC2, VMs)
CI/CD & Automation: Terraform, Bicep, Azure DevOps, Jenkins, GitHub Actions, GitLab
Observability: Prometheus, Grafana, Datadog, ELK, Splunk, Application Insights, CloudWatch
Languages & Scripting: Python, C#, Bash, PowerShell
Networking: DNS, SSL/TLS, load balancing, WAF, proxies, CDN, Azure App Gateway
Databases: MSSQL, PostgreSQL, MongoDB, CosmosDB, DynamoDB
OS & Systems: Windows, Linux, Nginx, IIS

Ideal Candidate Profile

5+ years of experience in SRE, DevOps, or production engineering roles.
Expertise operating in high-availability, fast-paced production environments.
Solid engineering foundation with experience reading and writing production code.
Hands-on experience deploying, supporting, and scaling OpenShift environments.
Proven track record of leading incident responses and improving system reliability.
Strong collaboration and mentoring abilities across infrastructure, development, and security teams.

What You’ll Bring

Ability to balance operational risk with engineering velocity.
Strong communication skills across technical and non-technical audiences.
A passion for automating everything and eliminating manual work.
A mindset of ownership, continuous improvement, and technical leadership.

Ready to make reliability your legacy?

If you’re a senior SRE with OpenShift experience and a drive to solve complex operational challenges, we’d love to hear from you.
