Site Reliability Engineer

PEXA Group
United Kingdom
GBP 70,000 - 75,000
Job description

Hi, we’re PEXA!

We know you’ll Google us before applying, so let’s keep this brief. PEXA revolutionised the way that property is settled in Australia, turning a paper-based process into a digital one. Our solution is a world first. With over 500 people across Australia and an expanding international team, we’re helping 20,000+ families into their homes each week.

We’re passionate about solving problems for our customers – always striving to set the standard for how property is bought and sold. Being named one of the best places to work in Australia is recognition of our culture and our commitment to innovation, customers and our community.

We’re growing fast, and that is where you come in.

We believe our success in Australia is worth sharing and that our proven technology will advance how the UK buys and sells homes.

Having established ourselves in the UK in late 2020, we are committed to collaborating with lawyers, conveyancers, lenders, government and the property industry to set the new standard for both remortgages and buying and selling property.

Why become a PEXArian?

Great question! Being a PEXArian is so much more than just a job. We’re a passionate, motivated and unashamedly enthusiastic bunch at PEXA – we love what we do and we’re proud to admit it! Creating brilliant experiences for our members and their clients wouldn’t be possible without ensuring we deliver an exceptional employee experience.

Here’s a snapshot of what your life at PEXA could look like:

Your growth:

We encourage you to hit your personal and professional learning and development goals with our tailored programs and tools.

Your wellness:

We care about your holistic wellbeing.

Your work/life blend:

We know that work is just one aspect of your life – we want to help you create your ideal work/life blend, rather than squeezing in life around work.

The Site Reliability Engineer is responsible for the technical support and operation of UK Platforms (from both an application and an infrastructure perspective), actively managing all incidents to resolution and supporting software releases. The role endeavours to make sure that PEXA Group’s support offering for our platform adheres to the highest level of operational and security requirements while delivering a seamless and secure support experience to our customers.

The role is also responsible for additional activities including (but not limited to) application (e.g. SWIFT SILs), OS and infrastructure patching, DR testing, creation of alerting and monitoring, and service transition activities such as knowledge transfer and updates to operational playbooks and knowledge articles.

The SRE will collaborate closely with the customer support team and the product development squads in various global locations to achieve the best outcome for the technical support of PEXA’s customers and integrated partners. The SRE will also work closely with PEXA AU run teams to ensure alignment with PEXA’s strategic direction of creating a consistent and “best in class” support experience for PEXA’s customers globally.

Overall, this role follows through on the vision and execution of the technical support function and is the contact point for technical incidents as well as for the support teams.

Key Accountabilities
  • Ensure high availability and reliability of UK platforms with day-to-day support.
  • Manage incidents with rapid resolution, root cause analysis, and post-mortems to prevent recurrence.
  • Optimise monitoring and alerting to enable proactive issue detection and fast response.
  • Identify process improvements and suggest service management enhancements for long-term stability.
  • Report problems, risks, issues, and change requests to minimise downtime.
  • Coordinate resolution and escalation of Platform Services issues, fostering cross-team collaboration.
  • Manage the Production environment, overseeing incidents, fixes, performance, and stability.
  • Drive continuous improvement by automating processes and enhancing operational performance.
  • Help define the cloud platform service roadmap to enhance system reliability.
  • Collaborate with UK Support and Delivery Squads to address pain points and add value.
  • Assist squads in estimating and resolving Platform Defects that cause incidents.
  • Oversee application, OS, and infrastructure patching, DR testing, monitoring setup, knowledge transfer (KT), and updates to operational playbooks and knowledge articles.
Knowledge & Skills
  • Experience with distributed systems in AWS and/or Azure cloud environments.
  • Bring a developer mindset to platform challenges, understanding how software and infrastructure are designed, implemented, and integrated.
  • Strong knowledge of container orchestration and scaling, with experience in managing and troubleshooting workloads.
  • Experience of managing Kubernetes clusters, service mesh and hosted workloads.
  • Proficient in observability and monitoring tools, including configuring alerts, creating dashboards, and conducting root cause analysis. Some of the tools we use are: Grafana, Prometheus, Elastic, Splunk.
  • Experience configuring incident management platforms such as PagerDuty.
  • Hands-on experience with Infrastructure-as-Code (IaC) and automation to improve operational efficiency, using tools like Terraform, Bicep or CloudFormation.
  • Strong understanding of modern SDLC and CI/CD processes, with experience in scripting, automation and version control systems such as Git.
  • Experience collaborating in DevSecOps, upholding security best practices and compliance standards. Understanding of security frameworks such as the Azure or AWS Well-Architected Frameworks.
  • Experience in high availability (HA) and disaster recovery (DR) strategies and execution.
  • Adept at collaborating with diverse teams across cultures and working effectively under pressure.
  • Empathetic team player who builds strong relationships, tackles challenges, and delivers results while maintaining quality and team morale.
  • Strong understanding of Agile principles, excellent communication skills, and a customer-centric mindset.
£70,000 - £75,000 a year

Sounds like you?

We at PEXA are ready, so if this role sounds like you, apply today.
