Lead Cloud Site Reliability Engineer

Lloyds Banking Group

Leeds

Hybrid

GBP 92,000 - 110,000

Full time

Yesterday

Job summary

A leading financial organization in the UK is looking for a Lead Site Reliability Engineer to join their Public Cloud Platform team. The ideal candidate will lead an SRE team, ensuring service reliability and performance while collaborating with various engineering teams. Essential experience includes SRE practices within Azure or GCP, a strong understanding of SLIs and SLOs, and hands-on incident management. This hybrid position offers a competitive salary, flexible working options and various benefits.

Benefits

28 days holiday plus bank holidays
Generous pension contribution
Private medical insurance
Flexible benefits

Qualifications

  • Proven experience applying SRE practices within Azure or GCP.
  • Strong understanding of SLIs, SLOs, and error budgets.
  • Hands-on experience in incident and problem management.
  • Background in software or cloud engineering.
  • Practical experience with DevOps and automation.

Responsibilities

  • Lead and develop a high-performing SRE team.
  • Embed reliability into roadmaps and delivery decisions.
  • Drive improvements in observability across services.
  • Collaborate with engineering teams to improve service operability.

Skills

SRE practices in Azure

Tools

Dynatrace
Terraform
Jenkins

Job description

End Date Tuesday 10 February 2026

Salary Range £92,701 - £109,060

Flexible Working Options Hybrid Working, Job Share

Job Description Summary

We support flexible working.

Job Description

Lead Site Reliability Engineer – Public Cloud Platform

Location: Halifax, Leeds or Manchester

Salary: £90,440 - £106,400

Working Pattern: Hybrid (2 days in office per week)

About the Opportunity

At Lloyds Banking Group, our purpose is to Help Britain Prosper. As we continue to redefine ourselves as a modern, innovative and purposeful organisation, we’re investing heavily in cloud, automation and engineering excellence across our platforms.

We’re looking for a Lead Site Reliability Engineer (SRE) to join our Public Cloud Platform, supporting both GCP and Azure. In this role you’ll help strengthen observability, reliability, and operational excellence across our cloud estate—enabling our ambition to become the UK’s leading FinTech.

You’ll work closely with Product Owners and Engineering Leads to embed SRE principles, lead a team of up to 15 SREs, and champion a culture of learning, automation and continuous improvement.

What You’ll Be Doing

  • Lead, coach and develop a high‑performing SRE team, fostering autonomy, inclusion and continuous improvement.
  • Partner with Product Owners and Engineering Leads to embed reliability into roadmaps, backlogs and delivery decisions.
  • Apply SRE principles (SLIs, SLOs, error budgets) to ensure our services remain highly reliable, performant and scalable.
  • Drive improvements in observability—across metrics, logs, traces and events—ensuring services are observable by design.
  • Use Dynatrace as the primary observability platform for meaningful dashboards and customer‑centric alerting.
  • Own Infrastructure‑as‑Code and CI/CD‑based environments, implementing enhancements and responding to operational change.
  • Lead coordination of incident response and root cause analysis, supporting teams through major incidents, post‑incident reviews and prevention of recurrence.
  • Collaborate with multi‑disciplinary engineering teams to remove technical impediments, reduce toil and improve service operability.
  • Contribute hands‑on engineering where needed, validating technical decisions and guiding best practice.
  • Bring curiosity, experimentation and first‑principles thinking to evolve our engineering culture.

What You’ll Bring

Essential Skills & Experience

  • Proven experience applying SRE practices within Azure, GCP, or both.
  • Strong understanding of SLIs, SLOs, error budgets, and how to use these to guide product and engineering decisions.
  • Experience ensuring reliability of production services, including availability, performance and recoverability.
  • Hands‑on or leadership experience in incident and problem management, focused on reducing MTTR and avoiding repeat issues.
  • Background in software engineering or cloud engineering, with good understanding of modern SDLC practices.
  • Practical experience with DevOps, CI/CD and automation to improve service reliability.
  • Experience improving observability on complex, distributed systems.
  • Ability to use data to influence prioritisation and balance reliability with feature delivery.
  • Collaboration and communication skills, working effectively with product, engineering and platform teams.
  • Experience mentoring engineers and promoting inclusive, supportive team culture.

Desirable Skills

  • Certifications or strong experience with Google Cloud Platform and/or Microsoft Azure.
  • Knowledge of Kubernetes, compute services, API management and large‑scale distributed systems.
  • Experience with Terraform, Jenkins, or equivalent configuration/pipeline tooling.
  • Ability to write and maintain scripts or code in languages such as Python, Bash, PowerShell or Groovy.
  • Solid grasp of cloud networking, security, and systems built around APIs.
  • Experience with Infrastructure as Code, modular design and scalable automation patterns.

About You

You’re someone who:

  • Is passionate about building resilient, observable, customer‑focused platforms.
  • Enjoys coaching others, sharing knowledge and shaping engineering culture.
  • Looks for opportunities to remove toil and introduce automation.
  • Thrives in collaborative, multi‑functional environments.
  • Adopts new tools, technologies and modern engineering approaches.
  • Values diverse perspectives, psychological safety and inclusive ways of working.

What You’ll Get in Return

We’re committed to building a truly inclusive workplace where everyone can grow, thrive and make a meaningful impact. As part of LBG, you’ll also receive:

  • A competitive salary and performance‑related bonus
  • 28 days holiday plus bank holidays
  • Generous pension contribution
  • Private medical insurance
  • Flexible benefits to suit your lifestyle
  • Hybrid working model and family‑friendly policies
  • Access to wellbeing support, training and career development

Inclusion and Diversity

We’re committed to building an inclusive environment where everyone can be themselves and thrive. We value diversity of thought, background and experience, and we actively encourage applications from all communities. If you need reasonable adjustments during the recruitment process, please let us know.

Additional Information

At Lloyds Banking Group, we're driven by a clear purpose: to help Britain prosper. Across the Group, our colleagues are focused on making a difference to customers, businesses and communities. With us you'll have a key role to play in shaping the financial services of the future, whilst the scale and reach of our Group means you'll have many opportunities to learn, grow and develop.

We keep your data safe, so we’ll only ever ask you to provide confidential or sensitive information once you have formally been invited to an interview or accepted a verbal offer to join us, which is when we run our background checks. We'll always explain what we need and why, with any request coming from a trusted Lloyds Banking Group person.

We're focused on creating a values‑led culture and are committed to building a workforce which reflects the diversity of the customers and communities we serve. Together we’re building a truly inclusive workplace where all of our colleagues have the opportunity to make a real difference.
