Site Reliability Engineering Manager

JR United Kingdom

London

On-site

GBP 70,000 - 110,000

Full time

2 days ago

Job summary

An innovative firm is seeking a Site Reliability Engineering Manager to lead the SRE function across Europe. This role is pivotal in ensuring the reliability and performance of critical infrastructure while collaborating with various teams to enhance service delivery. The ideal candidate will possess strong technical leadership and a passion for operational excellence, driving initiatives that improve system availability and scalability. Join a high-impact team where your expertise will contribute to a culture of ownership and innovation, making a significant impact on global operations.

Qualifications

  • 7+ years in a technical SRE or DevOps position.
  • 2+ years in a leadership or senior engineering capacity.

Responsibilities

  • Lead the Site Reliability Engineering function across Europe.
  • Build and mentor a high-performing SRE team focused on innovation.

Skills

Technical Leadership
Operational Excellence
Cross-Functional Collaboration
Capacity Planning
Monitoring & Analytics

Education

Bachelor’s degree in Computer Science, Engineering, or a related field
Master’s degree preferred

Tools

Jira
AWS
Kubernetes
Datadog
Prometheus
Grafana
Terraform
Ansible
Pulumi
SQL

Job description

Site Reliability Engineering Manager, London

Client:

Signify Technology

Location:

London, United Kingdom

Job Category:

Other

EU work permit required:

Yes

Posted:

28.04.2025

Expiry Date:

12.06.2025

Job Description:

The SRE Manager is responsible for leading the Site Reliability Engineering function across Europe, ensuring the reliability, scalability, and performance of critical infrastructure and services. This role plays a key part in the global follow-the-sun support model, working closely with the Global SRE Leader to support platforms worldwide.

The ideal candidate will bring strong technical leadership, deep subject matter expertise, and a passion for operational excellence to a high-impact team. You'll collaborate with Engineering, Infrastructure, and Operations teams to maintain high availability and resilient service delivery, while also mentoring a regional SRE team focused on continuous improvement and innovation.

Key Responsibilities:

Technical Leadership

  • Develop deep expertise in the Titanium trading platform to lead and support critical business operations.
  • Oversee team workload, ensuring priorities align with business goals and resource capacity.

Operational Excellence

  • Champion initiatives that enhance system availability, scalability, and performance.
  • Collaborate with the Global SRE Leader to refine and enforce operational policies (e.g., Capacity Planning, Change Management, Disaster Recovery).

Cross-Functional Collaboration

  • Partner with Software Engineering, Infrastructure, Operations, Security, and Business teams to deliver secure and reliable platforms.

Team Development

  • Build, lead, and mentor a high-performing SRE team in Europe, fostering a culture of ownership, collaboration, and innovation.
  • Lead response efforts for critical incidents, ensuring swift resolution and comprehensive root cause analysis.
  • Drive long-term improvements based on lessons learned from Learning Reviews, and maintain accurate incident documentation and compliance reporting.
  • Lead automation initiatives to streamline workflows and increase uptime.
  • Use Jira to manage tasks and projects, and align global SRE practices for seamless support.

Capacity Planning

  • Drive timely capacity planning to prevent last-minute issues.
  • Support budget planning to align infrastructure investments with growth and performance targets.
  • Participate in quarterly capacity reviews and follow up on outcomes.

Monitoring & Analytics

  • Oversee the implementation of monitoring and alerting systems that detect and resolve issues proactively, before they affect customers or compliance.

Qualifications:

  • Bachelor’s degree in Computer Science, Engineering, or a related field (Master’s preferred)
  • 7+ years in a technical SRE or DevOps position
  • 2+ years in a leadership or senior engineering capacity

Preferred Skills:

  • Proficiency in SQL and data analytics tools (e.g., Sigma, Snowflake)
  • Experience with FIX protocol and market data analysis
  • Proficiency in AWS, Kubernetes, monitoring tools (Datadog, Prometheus, Grafana), and automation frameworks (Terraform, Ansible, Pulumi)

For more information, please apply with a relevant CV.
