Head of Site Reliability Engineering

Reward Gateway

London

On-site

GBP 100,000 - 120,000

Full time

30+ days ago

Job summary

An established industry player is seeking a Head of Site Reliability Engineering to transform operational workloads into an SRE approach. This pivotal role involves establishing a new SRE function, modernizing cloud infrastructure, and ensuring high availability and performance for customer-facing systems. You will lead teams in adopting SRE practices, balancing innovation with stability, and driving cost efficiency. Join a dynamic environment where your leadership and technical skills will significantly impact employee engagement and organizational resilience, contributing to a mission of making the world a better place to work.

Qualifications

  • Proven leadership in SRE within a global organization.
  • Experience with AWS and automation tools like Terraform.
  • Deep understanding of SRE practices and observability.

Responsibilities

  • Establish and manage the new SRE function.
  • Operate and modernize existing cloud infrastructure.
  • Educate teams in SRE practices and ensure compliance.

Skills

Leadership and management experience
AWS or another cloud provider
Enterprise infrastructure experience
Automation skills (Terraform, Python, Bash)
SRE practices knowledge
SQL, PHP, Kubernetes, CI/CD
Observability tools (New Relic, Datadog)
Strong facilitation and servant leadership skills
Ability to work under pressure
Adaptability and flexibility

Tools

JIRA
Confluence

Job description

Engineering, London, Full Time, £100,000 - £120,000 / year

In May 2023 Reward Gateway was acquired by Edenred. Edenred is a leading digital platform for services and payments for people at work, connecting 52 million users and 2 million partner merchants in 45 countries via close to 1 million corporate clients.

With our shared missions of ‘Making the World a Better Place to Work’ and ‘Enriching connections, For good’, you’ll contribute to improving employee engagement and building better, stronger, and more resilient organisations to improve people’s daily lives. Our shared mission guides our actions and charts a sustainable path to a better future.


Due to expansion, an opportunity has become available for a Head of Site Reliability Engineering to join our team to help us transform our existing operational workloads to an SRE approach.

Key Responsibilities
  • Establishing and managing our new SRE function
  • Operating and modernising our existing cloud infrastructure
  • Partnering with our DevOps team to ensure fast & supportable platform updates
  • Maintaining the highest standards for our customer-facing systems
  • Balancing the desire for innovation with stability and delivery for our customers
  • Ensuring our availability and performance are maintained at the highest levels
  • Acting as a key Incident Commander and escalation point
  • Liaising closely with our SecOps teams to ensure timely vulnerability management
  • Educating teams in SRE practices and maintaining high standards of compliance
  • Implementing world-class observability standards utilising SLI/SLO/Error Budgets
  • Continually evolving our observability platforms for greater coverage
  • Liaising with Product & Engineering teams for constant evolution of metrics
  • Aligning SRE Sprints & Backlog with our roadmaps to meet business expectations
  • Guiding our teams in a more Agile approach to demand management
  • Actively taking part in our daily stand-ups and keeping our Sprints on track
  • Keeping up-to-date documentation in our JIRA & Confluence tools
  • Owning and maintaining our SRE Incident Management processes
  • Ensuring a focus on cost efficiency for our platforms & services
  • Removing obstacles and fostering team collaboration
  • Communicating with our stakeholders
Skills
  • Demonstrated leadership and management experience as a Senior Manager or Head of SRE within a global organisation
  • Experience with AWS preferred (or another cloud provider)
  • Enterprise infrastructure experience in high-availability environments
  • Automation skills through Terraform, Python, Bash or similar
  • Experience in fast-releasing environments with automated pipelines and QA
  • Wide-reaching SRE skills and a deep understanding of SRE practices
  • SRE leadership skills with an ability to drive SRE adoption
  • A strong understanding of SQL, PHP, Kubernetes, CI/CD
  • Observability product experience (e.g. New Relic, Datadog)
  • Strong facilitation and servant leadership skills
  • Ability to work both independently and as part of a team
  • Ability to work under pressure and be highly reliable
  • Leadership, time management, and organisational skills
  • Adaptability and flexibility to change in a fast-moving environment
  • An ability to learn new tools and processes quickly and impart that knowledge
The Interview Process
  • Screening video interview with the Senior Talent Partner
  • Interview with the Director of Infrastructure and Head of Development
  • Final interview with the Director of Engineering & CTO

Be comfortable. Be you.
At Reward Gateway, we want our employees to feel comfortable bringing their passion, creativity, and individuality to work. We value all cultures, backgrounds, and experiences, as we truly believe that diversity drives innovation. Express yourself, join our community and help us Make the World a Better Place to Work.

We hire BETTER.
From perks to people, our BETTER approach to hiring earns us more trust, happier people and more world-class talent that help us to make the world a better place to work. Find out more about Reward Gateway's approach to benefits, equality, talent, technology, and empathy, and what you’ll get in return for joining our Mission at rg.co/lifeatrg.

Similar jobs

Head of Site Reliability Engineering (SRE)

JR United Kingdom

City of London

On-site

GBP 80,000 - 120,000

Full time

11 days ago

Head of Site Reliability Engineering (SRE)

JR United Kingdom

London

On-site

GBP 85,000 - 130,000

Full time

26 days ago

Head of Site Reliability Engineering - Midnight

Io Me

London

On-site

GBP 90,000 - 130,000

Full time

30+ days ago

Head of Site Reliability Engineering (SRE)

ZipRecruiter

London

On-site

GBP 90,000 - 150,000

Full time

22 days ago

Head of Site Reliability Engineering & Platform

DeepL

London

Hybrid

GBP 90,000 - 130,000

Full time

15 days ago

Head of Site Reliability Engineering (SRE)

JR United Kingdom

Slough

On-site

GBP 90,000 - 130,000

Full time

26 days ago

Site Reliability Engineering Team Lead

ZipRecruiter

Crawley

Hybrid

GBP 100,000 - 140,000

Full time

14 days ago