Site Reliability Engineer

Damia Group

England

Hybrid

GBP 50,000 - 80,000

Full time

Yesterday

Job summary

A leading technology firm in the UK is seeking a Security Cleared Site Reliability Engineer for a contract role. You will lead operations over a legacy technology estate, ensuring service stability and operational readiness. The position requires strong skills in Java, AWS, and Kubernetes, as well as the ability to manage incidents and mentor team members. The role offers a hybrid work model and supports professional development in a multi-vendor environment.

Qualifications

  • Current and active SC Clearance.
  • Experience in managing/supporting enterprise-scale services.
  • Familiarity with ITIL-aligned service management.

Responsibilities

  • Lead daily operational support of legacy systems.
  • Manage incident, problem, and change activities.
  • Act as escalation point for critical incidents.
  • Define and maintain operational documentation.
  • Oversee job scheduling and automation activities.

Skills

Java
AWS
Kubernetes
Incident Management
Leadership skills
Analytical abilities
Documentation skills
Job description
Overview

Security Cleared Site Reliability Engineer - Contract Outside IR35 - 3 months+ - Hybrid

We are seeking a Lead Operations/Site Reliability Engineer to take ownership of day-to-day operations across a legacy technology estate. The role will focus on maintaining service stability, ensuring operational readiness, and leading the response to incidents and outages. The Lead Operations/Site Reliability Engineer will play a pivotal role during the transition phase by embedding operational standards, improving monitoring and support processes, and enabling knowledge transfer into ongoing service delivery teams.

Responsibilities
  • Lead daily operational support of legacy systems, ensuring availability, performance, and resilience.
  • Manage incident, problem, and change activities in line with ITIL and enterprise service standards. Proactively monitor and tune infrastructure, applications, messaging, and scheduling platforms.
  • Act as the escalation point for critical incidents, coordinating technical resources to achieve rapid resolution. Lead root cause analysis and service improvement initiatives.
  • Define and maintain runbooks, standard operating procedures, and operational documentation.
  • Ensure backup, recovery, and disaster recovery processes are operationally tested and aligned to business needs.
  • Oversee job scheduling, batch management, and automation activities (e.g., Tivoli Scheduler).
  • Collaborate with Infrastructure, Development, and Architecture teams to support upgrades, migrations, and modernisation efforts.
  • Mentor operations engineers and manage knowledge transfer from discovery into business-as-usual operations.
Competencies
  • Technical background in Java, AWS, and Kubernetes.
  • Customer engagement management, with strong leadership and coordination skills across technical and non-technical stakeholders.
  • Excellent analytical and diagnostic abilities, with a structured approach to discovery and documentation. Skilled in documenting processes, monitoring metrics, and reporting on operational health.
  • Excellent communication and documentation skills for effective knowledge capture and handover, particularly in high-pressure incident management situations.
  • Ability to operate both in deep technical detail and at a higher-level architectural/system view.
  • Analytical and detail-oriented, with a continuous improvement mindset.
  • Incident management experience; resilient under pressure and effective at prioritising competing demands.
Experience
  • Ability to prioritise effectively in a complex, multi-system environment.
  • Technical skills in Java, AWS, and Kubernetes.
  • Current and active SC Clearance.
  • Proven track record in managing/supporting enterprise-scale services.
  • Experience of working in a multi-vendor environment, where coordination, triage, and joint working are essential for operational activities.
  • Familiarity with ITIL-aligned service management.
  • Experience building knowledge transfer (KT) libraries.
  • Previous involvement in system migrations, re-platforming, or legacy modernisation programmes is highly desirable.
  • Background in high-availability, disaster recovery, and enterprise integration patterns.