Lead Site Reliability Engineer

JR United Kingdom

London

On-site

GBP 60,000 - 100,000

Full time

2 days ago

Job summary

An established industry player is seeking a talented Site Reliability Engineer to join their high-impact team in London. This role is pivotal in enhancing operational excellence and providing technical leadership in a dynamic environment. You will collaborate with cross-functional teams to ensure high availability and resilient service delivery, while mentoring a dedicated SRE team. If you are passionate about driving innovation and continuous improvement in a fast-paced setting, this opportunity is perfect for you to make a significant impact.

Qualifications

  • 5+ years in a technical SRE or DevOps position.
  • 2+ years in a leadership or senior engineering role.

Responsibilities

  • Develop expertise in the Titanium trading platform to support operations.
  • Champion initiatives for system availability and performance.

Skills

Technical Leadership
Operational Excellence
Collaboration
Incident Management
Capacity Planning
Data Analytics

Education

Bachelor’s degree in Computer Science
Master’s degree in related field

Tools

AWS
Jira
Datadog
Prometheus
Grafana
Terraform
Ansible
Pulumi

Job description

This role plays a key part in the global follow-the-sun support model, working closely with the Global SRE Leader to support platforms worldwide. We are looking for SRE talent with experience in an On-Prem / Datacenter environment.

The ideal candidate will bring strong technical leadership, experience in an On-Prem / Datacenter environment, and a passion for operational excellence to a high-impact team. You'll collaborate with Engineering, Infrastructure, and Operations teams to maintain high availability and resilient service delivery, while also mentoring an SRE team focused on continuous improvement and innovation.

Key Responsibilities:

Technical Leadership

  • Develop deep expertise in the Titanium trading platform to lead and support critical business operations.
  • Oversee team workload, ensuring priorities align with business goals and resource capacity.

Operational Excellence

  • Champion initiatives that enhance system availability, scalability, and performance.
  • Collaborate with the Global SRE Leader to refine and enforce operational policies (e.g., Capacity Planning, Change Management, Disaster Recovery).

Cross-Functional Collaboration

  • Partner with Software Engineering, Infrastructure, Operations, Security, and Business teams to deliver secure and reliable platforms.

Team Development

  • Build, lead, and mentor a high-performing SRE team in Europe, fostering a culture of ownership, collaboration, and innovation.
  • Lead response efforts for critical incidents, ensuring swift resolution and comprehensive root cause analysis.
  • Drive long-term improvements based on lessons learned from Learning Reviews, and maintain accurate incident documentation and compliance reporting.
  • Lead automation initiatives to streamline workflows and increase uptime.
  • Use Jira to manage tasks and projects, and align global SRE practices for seamless support.

Capacity Planning

  • Drive timely capacity planning to prevent last-minute issues.
  • Support budget planning to align infrastructure investments with growth and performance targets.
  • Participate in quarterly capacity reviews and follow up on outcomes.

Monitoring & Analytics

  • Oversee the implementation of monitoring and alerting systems to detect and resolve issues proactively—before customer or compliance impacts occur.

Qualifications:

  • Bachelor’s degree in Computer Science, Engineering, or related field (Master’s preferred)
  • 5+ years in a technical SRE or DevOps position
  • 2+ years in a leadership or senior engineering capacity

Preferred Skills:

  • Proficiency in SQL and data analytics tools (e.g., Sigma, Snowflake)
  • Experience in AWS, monitoring tools (Datadog, Prometheus, Grafana), and automation frameworks (Terraform, Ansible, Pulumi)

For more information, please apply with a relevant CV.

