Senior Site Reliability Engineer

Tria

London

Hybrid

GBP 85,000 - 100,000

Full time

19 days ago

Job summary

A leading hospitality company in London is seeking a Senior Site Reliability Engineer to improve the reliability of its digital platforms. Candidates should have 5+ years in SRE/DevOps roles, strong incident response skills, and experience in high-traffic environments. The role offers a competitive salary and a hybrid work model.

Benefits

Car allowance
Bonus

Qualifications

  • 5+ years in SRE/DevOps roles; strong background in incident response.
  • Experience in high-traffic digital or eCommerce platforms.
  • Understanding of SRE practices.

Responsibilities

  • Lead incident management, post-mortems, and blameless RCAs.
  • Build scalable, resilient microservices with dev teams.
  • Improve alerting, monitoring, and system-level metrics.

Skills

Leadership skills
Observability expertise
Incident response
Automation
Infrastructure as code

Tools

Kubernetes
Terraform
AWS
Python
CI/CD tools

Job description

Senior Site Reliability Engineer

Central London (Hybrid)

Up to £100k + Car Allowance & Bonus

TRIA are working with a leading hospitality client to hire a Senior SRE. The client is investing heavily in the performance, stability, and reliability of its digital platforms.

This is a hands-on leadership role - you won't just guide others, you'll be the go-to expert when systems are under pressure. You'll lead incident response, own root cause analysis, and solve performance issues like memory leaks, outages, and flaky services.
You will take ownership of site reliability and drive it as a discipline.

Your focus will include:
  • Leading incident management, post-mortems, and blameless RCAs
  • Building scalable, resilient microservices with the dev teams
  • Uplifting observability
  • Improving alerting, monitoring, and system-level metrics
  • Driving better SLOs, SLIs, and overall uptime


What you'll bring:
  • Experience in high-traffic digital or eCommerce platforms
  • 5+ years in SRE/DevOps roles; strong background in incident response
  • Observability, automation, and infrastructure as code expertise
  • Leadership skills - mentoring others or leading from the front


The stack includes Kubernetes, Terraform, AWS, Python, and modern CI/CD tools, and it's evolving.

If you understand what good SRE practice looks like and want to leave systems in a better place than you found them, please apply to be considered and to learn more.