Site Reliability Engineer

JR United Kingdom

Birmingham

On-site

GBP 70,000 - 90,000

Full time

4 days ago

Job summary

A leading company is seeking a Lead Technical Subject Matter Expert to enhance the performance and resilience of their technology estate. This role requires a blend of hands-on engineering and architectural oversight, focusing on capacity management and observability controls. The ideal candidate will have extensive experience in engineering and a solid grasp of SRE principles. You'll collaborate with various teams to drive scalable solutions and ensure operational readiness.

Qualifications

  • 10+ years in engineering or technical architecture roles.
  • Solid understanding of capacity planning across mixed deployment models.

Responsibilities

  • Lead design and evaluation of capacity management and observability controls.
  • Collaborate with engineering and infrastructure teams to validate designs.

Skills

Engineering
Infrastructure
Technical Architecture
Communication
Performance Monitoring
Operational Controls

Tools

Geneos
Prometheus
Grafana
AppDynamics

Job description

Location: On-site 3 days a week in Birmingham or Sheffield

Role Overview:

We are seeking a Lead Technical Subject Matter Expert (SME) with strong systems thinking and a solid grasp of SRE principles to drive the technical uplift of capacity and observability controls across our technology estate. This role blends hands-on engineering depth with architectural oversight and focuses on enhancing performance, resilience, and control effectiveness across services and platforms.

The ideal candidate combines operational sensibility with the ability to drive scalable solutions, aligning technical capabilities with internal control frameworks and regulatory expectations.

Key Responsibilities:

• Lead the design and technical evaluation of capacity management, utilisation monitoring, and observability controls across platforms.

• Apply SRE-aligned practices to identify control gaps, performance risks, and areas for automation.

• Assess existing tooling, data flows, and operational practices, and propose remediation strategies for the control gaps found.

• Collaborate with engineering, infrastructure, architecture, and risk teams to validate technical designs and implementation plans.

• Define reusable technical patterns and tooling strategies that enhance operational readiness and control sustainability.

• Support roadmap shaping, tooling assessment, and documentation for governance and operational readiness.

Required Skills & Experience:

• 10+ years in engineering, infrastructure, or technical architecture roles in complex technology environments.

• Solid understanding of compute, storage, and network capacity planning across mixed deployment models.

• Familiarity with SRE disciplines such as observability, service-level indicators/objectives (SLIs/SLOs), and automation of operational tasks.

• Demonstrated ability to interpret and apply control requirements in technical design contexts.

• Hands-on experience with performance monitoring, alerting systems, and diagnostic tooling (e.g., Geneos, Prometheus, Grafana, AppDynamics).

• Strong communication skills, with the ability to convey technical concepts to senior stakeholders and control partners.

• Experience in implementing or uplifting operational controls (capacity, performance, availability).

• Exposure to internal risk frameworks or external regulatory requirements (e.g., DORA, EBA, PRA).

• Background in service reliability, system diagnostics, or incident response.

