Senior Site Reliability Engineer

Leap29

Wokingham

Hybrid

GBP 100,000 - 125,000

Full time

Today

Job summary

A technology services company in Wokingham is searching for a Senior Site Reliability Engineer (SRE) to lead initiatives in maintaining the performance and reliability of key platforms. The ideal candidate will have extensive experience with cloud environments, particularly OpenShift, and will play a pivotal role in automating processes and improving incident management. This position emphasizes collaboration across teams to ensure engineering stability and operational excellence. Competitive hourly rates apply depending on experience.

Qualifications

  • Hands-on experience with Azure, AWS, OpenShift, Kubernetes, and Docker.
  • Experience designing scalable infrastructure and automating processes.
  • Strong communication skills for cross-team collaboration.

Responsibilities

  • Ensure high availability and performance of critical systems across multiple platforms.
  • Design and implement robust observability systems for proactive issue resolution.
  • Mentor engineering teams on best practices for deployment and reliability.

Skills

Cloud & Containers
CI/CD & Automation
Observability
Languages & Scripting
Networking
Databases
OS & Systems
Incident Management
Collaboration

Experience

5+ years of experience in SRE, DevOps, or production engineering roles

Tools

Terraform
Azure DevOps
GitHub Actions
Jenkins
OpenShift

Job description

Senior Site Reliability Engineer (SRE)

Location: Wokingham (2 days/week onsite)

Type: Inside IR35

Rate: Up to £70.00 per hour (DOE)

We’re looking for a Senior Site Reliability Engineer (SRE) to lead efforts in maintaining the reliability, performance, and scalability of mission-critical platforms and services. This role is ideal for someone who thrives at the intersection of software engineering, infrastructure, automation, and incident response.

You’ll be instrumental in defining and implementing the standards and systems that keep applications running smoothly across cloud and hybrid environments—including OpenShift clusters.

What You’ll Be Responsible For

As a Senior SRE, you will:

  • Ensure high availability, strong performance, and low latency for critical systems across Azure, AWS, and OpenShift.
  • Design and implement robust observability systems (logging, monitoring, alerting) to detect and resolve issues proactively.
  • Lead and evolve incident management processes—runbooks, comms, postmortems, and root cause analysis.
  • Define and monitor SLIs, SLOs, and error budgets to balance innovation with stability.
  • Automate manual processes through infrastructure-as-code, scripting, and modern CI/CD pipelines.
  • Mentor engineering teams on best practices for deployment, reliability, scalability, and incident preparedness.
  • Support and scale OpenShift-based containerized applications, including upgrade strategies, patching, and workload optimization.

Core Responsibilities

Operations & Incident Management

  • Act as the senior escalation point for outages and critical incidents.
  • Lead post-incident reviews and implement long-term remediation plans.
  • Communicate platform health and risk posture to stakeholders at all levels.

Engineering & Automation

  • Build and improve CI/CD pipelines using tools like Azure DevOps, GitHub Actions, Jenkins, and GitLab.
  • Design scalable, fault-tolerant infrastructure with IaC tools (Terraform, Bicep).
  • Create internal tools and automation to accelerate development and reduce operational toil.

Strategic & Advisory

  • Architect cloud and container infrastructure, with a focus on OpenShift, Kubernetes, and hybrid deployments.
  • Collaborate with engineering, architecture, and security teams to embed reliability into the SDLC.
  • Promote advanced deployment strategies (blue-green, canary, rolling updates) and rollback readiness.
  • Drive a culture of reliability, observability, and operational excellence across engineering teams.

Technical Environment

Hands-on experience with many of the following is expected:

  • Cloud & Containers: Azure, AWS, OpenShift, Kubernetes, Docker, App Services, IaaS (EC2, VMs)
  • CI/CD & Automation: Terraform, Bicep, Azure DevOps, Jenkins, GitHub Actions, GitLab
  • Observability: Prometheus, Grafana, Datadog, ELK, Splunk, Application Insights, CloudWatch
  • Languages & Scripting: Python, C#, Bash, PowerShell
  • Networking: DNS, SSL/TLS, load balancing, WAF, proxies, CDN, Azure App Gateway
  • Databases: MSSQL, PostgreSQL, MongoDB, CosmosDB, DynamoDB
  • OS & Systems: Windows, Linux, Nginx, IIS

Ideal Candidate Profile

  • 5+ years of experience in SRE, DevOps, or production engineering roles.
  • Expertise operating in high-availability, fast-paced production environments.
  • Solid engineering foundation with experience reading and writing production code.
  • Hands-on experience deploying, supporting, and scaling OpenShift environments.
  • Proven track record of leading incident responses and improving system reliability.
  • Strong collaboration and mentoring abilities across infrastructure, development, and security teams.

What You’ll Bring

  • Ability to balance operational risk with engineering velocity.
  • Strong communication skills across technical and non-technical audiences.
  • A passion for automating everything and eliminating manual work.
  • A mindset of ownership, continuous improvement, and technical leadership.

Ready to make reliability your legacy?

If you’re a senior SRE with OpenShift experience and a drive to solve complex operational challenges, we’d love to hear from you.
