Principal Site Reliability Engineer

Orgvue

London

Hybrid

GBP 70,000 - 100,000

Full time

2 days ago

Job summary

Join a forward-thinking company as a Principal Site Reliability Engineer, where you will lead the charge in scaling and enhancing AWS and Kubernetes infrastructure. This role is pivotal in shaping a culture of reliability and operational excellence, allowing you to collaborate across teams and drive innovative solutions. With a focus on automation, incident management, and observability, you will help create a robust engineering foundation that supports growth and adaptability in a dynamic environment. Embrace the opportunity to make a significant impact in a diverse and inclusive workplace.

Benefits

Subsidised Gym Membership
Private Medical Insurance
Annual Discretionary Bonus
Cycle to Work Scheme
Employer pension contribution
25 days holiday
Summer Fridays
Season ticket loan
Wellbeing Coaching
Virtual fitness sessions

Qualifications

  • Experience leading SRE transformations and mentoring engineers.
  • Deep hands-on expertise with Kubernetes and AWS core services.

Responsibilities

  • Define SLOs, SLIs, and error budgets across critical services.
  • Implement observability (metrics, logs, and traces) and guide the team in building automated, self-healing systems.

Skills

Kubernetes
AWS core services
Infrastructure as Code (Terraform)
Observability practices
Incident management
Automation and CI/CD pipelines

Tools

Terraform
GitOps
CloudFormation

Job description

Orgvue is an organisational design and planning platform that empowers your business to transform its workforce by understanding the work people do and the skills they have. Our platform connects strategy to structure, providing clarity of vision, so you can build a more adaptable, better performing organisation that thrives in a constantly changing world of work.

The world’s largest and best-known enterprises and consulting firms use Orgvue to visualise and model current and future states of the organisation and make faster, more informed decisions. The company is headquartered in London, with offices in Philadelphia, The Hague, Toronto, and Sydney.

As a Principal Site Reliability Engineer, you will be a senior technical leader focused on scaling and hardening our AWS- and Kubernetes-based infrastructure. You will work across product, platform, and operations teams to ensure our systems are reliable, observable, and resilient — even at scale.

This role combines hands-on technical capability with strategic vision, helping us build a world-class reliability culture and a robust engineering foundation for growth. We're looking for someone who has technical expertise, is a great communicator and enjoys collaborating across multiple teams.

Key Responsibilities:

  1. Define and enforce SLOs, SLIs, and error budgets across critical services.
  2. Design and implement cloud infrastructure and tooling strategies.
  3. Enhance SRE practices across the organisation.
  4. Implement observability (metrics, logs, and traces) using our observability tools.
  5. Guide the team in building automated, self-healing systems.
  6. Own and evolve incident response processes, including on-call practices and post-mortem culture.
  7. Mentor engineers on reliability, operational readiness, and scalable infrastructure best practices.
  8. Drive Infrastructure as Code (IaC) initiatives using Terraform, Kubernetes, CloudFormation, and GitOps.
  9. Collaborate with security, DevOps, and software teams to ensure compliance and operational excellence.
  10. Evaluate and introduce tools and practices to improve platform performance and reliability.

Desired Skills & Experience:

  1. Experience leading SRE transformations.
  2. Hands-on expertise with Kubernetes (EKS preferred) in production.
  3. Strong experience with AWS core services (EC2, EKS, RDS, S3, ALB/NLB, IAM, CloudWatch, etc.).
  4. Expertise in Infrastructure as Code using Terraform and knowledge of GitOps workflows.
  5. Background in observability: metrics, visualisation, logging, and tracing.
  6. Understanding of automation, including CI/CD pipelines, deployment automation, and release strategies.
  7. Experience with incident management, disaster recovery, root cause analysis, and post-incident reviews.

Additional Benefits:

  • Hybrid working: at least one day a week in the London office.
  • Wellbeing initiatives including coaching, fitness sessions, webinars, and an annual wellbeing day.
  • Subsidised gym membership.
  • Private medical insurance (including dental and vision) and life assurance.
  • 25 days holiday, increasing to 30 days with service.
  • Summer Fridays (half-days in July and August).
  • Employer pension contribution of 5% (with a minimum employee contribution of 3%).
  • Season ticket loan.
  • Cycle to Work Scheme.
  • Annual discretionary bonus.

Here at Orgvue, we value individuality and a diverse workforce as the foundation of our future success.
