Site Reliability Engineering Manager

JR United Kingdom

Chester

On-site

GBP 70,000 - 90,000

Full time

Today

Job summary

A leading software engineering services provider is seeking a Site Reliability Engineering Manager in Chester. The role involves leading platform stability efforts, hands-on development in Java, and collaborating with product and infrastructure teams. Ideal candidates will have strong expertise in Java, SRE practices, and experience with tools like Kafka and Oracle DB.

Qualifications

  • Strong Java expertise with backend design patterns.
  • Proven experience in SRE including monitoring and incident management.

Responsibilities

  • Lead a team of platform engineers to build high-performance systems.
  • Monitor, troubleshoot, and resolve production issues.

Skills

Java
Site Reliability Engineering
Communication

Tools

Spring
MuleSoft
Kafka
Oracle DB
Jenkins
Terraform
Ansible

Job description

Site Reliability Engineering Manager, Chester

Client: Ascendion

Location: Chester, United Kingdom

Job Category: Other

EU work permit required: Yes

Posted: 12.05.2025

Expiry Date: 26.06.2025

Job Description:

We are seeking a Platform Engineering Manager with a strong hands-on background in Java development and Site Reliability Engineering (SRE). The ideal candidate will have a broad technical skillset across Java, Spring, MuleSoft, Kafka, and Oracle DB, and must be capable of leading platform stability efforts while contributing directly to development. Experience in building scalable, resilient systems is critical. Knowledge of payment systems is a plus.

Key Responsibilities:
  • Lead a team of platform engineers in building and maintaining robust, high-performance systems.
  • Take ownership of platform stability, reliability, scalability, and performance.
  • Collaborate closely with product teams, infrastructure, and DevOps to address platform issues and implement improvements.
  • Architect and develop resilient backend systems primarily using Java, Spring, Kafka, and Oracle.
  • Implement best practices for observability, incident response, and operational excellence in line with SRE principles.
  • Drive automation and self-healing mechanisms across platform components.
  • Provide technical leadership and hands-on coding as needed.
  • Monitor, troubleshoot, and resolve production issues, conducting root cause analysis and driving long-term fixes.

Required Skills & Experience:
  • Experience in software development and platform engineering.
  • Strong Java expertise with deep understanding of backend design patterns and frameworks (Spring Boot preferred).
  • Proven experience in Site Reliability Engineering (SRE), including monitoring, alerting, and incident management.
  • Hands-on experience with Kafka, MuleSoft, and Oracle DB.
  • Familiarity with performance tuning, system design, and distributed computing concepts.
  • Experience with CI/CD pipelines and infrastructure-as-code (e.g., Jenkins, Terraform, Ansible) is a plus.
  • Ability to lead and mentor engineers while remaining hands-on.
  • Strong communication and cross-functional collaboration skills.

About Us:
  • Ascendion is a leading provider of AI-first software engineering services.
  • Our applied AI, software engineering, cloud, data, experience design, and talent transformation capabilities accelerate innovation for Global 2000 clients.
  • Ascendion is headquartered in New Jersey.
  • In addition to our remote/hybrid workforce, we have 30+ offices across the U.S., UK, Poland, Romania, India, Australia, and Mexico.
  • We are committed to building technology powered by Generative AI with an inclusive workforce, service to our communities, and a vibrant culture.
  • For more information, please go to www.ascendion.com.
