Site Reliability Engineer

PEXA Group Limited

United Kingdom

Remote

GBP 70,000 - 75,000

Full time

Today

Job summary

An innovative firm is seeking a Site Reliability Engineer to enhance the support and operation of UK Platforms. This role involves ensuring high availability, managing incidents, and collaborating with teams to optimize technical support. With a focus on continuous improvement and automation, you will play a key role in shaping the future of property transactions in the UK. Join a passionate team dedicated to delivering exceptional experiences and enjoy a supportive work environment that prioritizes your growth and wellbeing.

Qualifications

  • Experience with distributed systems in AWS and/or Azure cloud environments.
  • Strong knowledge of container orchestration and scaling.

Responsibilities

  • Ensure high availability and reliability of UK platforms through daily support.
  • Manage incidents efficiently and perform root cause analysis.

Skills

AWS
Azure
Kubernetes
Terraform
Git
Grafana
Prometheus
Elastic
Splunk
Incident Management

Tools

PagerDuty
Bicep
CloudFormation

Job description

Hi, we’re PEXA!

We know you’ll Google us before applying, so let’s keep this brief. PEXA revolutionised the way property is settled in Australia, transforming a paper-based process into a digital one. Our solution is a world first. With over 500 people across Australia and an expanding international team, we help 20,000+ families into their homes each week.

We’re passionate about solving problems for our customers and setting the standard for how property is bought and sold. Being named one of the best places to work in Australia recognises our culture and our commitment to innovation, customers, and community.

We’re growing fast, and that’s where you come in.

We believe our success in Australia is worth sharing, and our proven technology will advance how the UK buys and sells homes.

Since establishing ourselves in the UK in late 2020, we have been committed to collaborating with lawyers, conveyancers, lenders, government, and the property industry to set new standards for remortgages and property transactions.

Why become a PEXArian?

Being a PEXArian is more than just a job. We’re a passionate, motivated, and enthusiastic team who love what we do! We focus on creating exceptional experiences for our members and their clients by delivering an outstanding employee experience.

Here’s a snapshot of what your life at PEXA could look like:

Your growth:

We encourage you to pursue your personal and professional development goals with tailored programs and tools.

Your wellness:

We care about your holistic wellbeing.

Your work/life blend:

We support you in creating your ideal work/life balance rather than squeezing life around work.

The Site Reliability Engineer is responsible for the technical support and operation of UK Platforms (application and infrastructure), managing incidents to resolution, and supporting software releases. The role ensures that PEXA’s platform support adheres to high operational and security standards while providing a seamless and secure support experience for our customers.

This role also involves activities such as application (e.g., SWIFT SILs), OS and infrastructure patching, disaster recovery testing, alerting and monitoring setup, and knowledge transfer activities like updating operational playbooks and knowledge articles.

The SRE will collaborate closely with customer support teams and product development squads globally to optimize technical support and align with PEXA’s strategic goals of providing a consistent, best-in-class support experience worldwide.

This role is the main point of contact for technical incidents and support teams, executing the vision and strategy of the technical support function.

Key Accountabilities
  • Ensure high availability and reliability of UK platforms through daily support.
  • Manage incidents efficiently, perform root cause analysis, and conduct post-mortems to prevent recurrence.
  • Enhance monitoring and alerting systems for proactive issue detection and rapid response.
  • Identify and implement process improvements for long-term stability.
  • Report problems, risks, issues, and change requests to minimize downtime.
  • Coordinate resolution and escalation of platform issues, fostering collaboration across teams.
  • Manage the production environment, including incidents, fixes, performance, and stability.
  • Drive continuous improvement through automation and operational enhancements.
  • Contribute to defining the cloud platform service roadmap for increased system reliability.
  • Collaborate with UK Support and Delivery Squads to address challenges and add value.
  • Assist squads in estimating and resolving platform defects causing incidents.
  • Oversee application, OS, infrastructure patching, DR testing, monitoring, and documentation updates.

Knowledge & Skills

  • Experience with distributed systems in AWS and/or Azure cloud environments.
  • Developer mindset for platform challenges, understanding software and infrastructure design and integration.
  • Strong knowledge of container orchestration, scaling, and workload management.
  • Experience managing Kubernetes clusters, service mesh, and hosted workloads.
  • Proficiency with observability and monitoring tools like Grafana, Prometheus, Elastic, Splunk.
  • Experience configuring incident management platforms such as PagerDuty.
  • Hands-on experience with Infrastructure-as-Code (IaC) and automation tools like Terraform, Bicep, or CloudFormation.
  • Understanding of modern SDLC, CI/CD, scripting, automation, and version control (e.g., Git).
  • Knowledge of security practices, DevSecOps, and frameworks such as the Azure or AWS Well-Architected Framework.
  • Experience in high availability (HA) and disaster recovery (DR) strategies.
  • Ability to work effectively across cultures and under pressure.
  • Empathetic team player with strong relationship-building skills, problem-solving abilities, and results orientation.
  • Excellent communication skills, customer-centric mindset, and familiarity with Agile principles.

Salary: £70,000 - £75,000 a year

Sounds like you?

We at PEXA are ready to hear from you—apply today.

GDPR Compliance

Digital Completion UK Limited (trading as PEXA), Optima Legal Services Limited, and Smoove Limited are owned by DigCom UK Holdings Limited, a subsidiary of PEXA Group Limited in Australia. By applying, you consent to the processing of your data in accordance with UK GDPR and the Data Protection Act 2018, as detailed in our privacy notice.
