Security Cleared Site Reliability Engineer

Damia Group Ltd

Willenhall

Hybrid

GBP 100,000 - 125,000

Full time

12 days ago

Job summary

A prominent technology firm in the United Kingdom is seeking a Lead Operations/Site Reliability Engineer for a hybrid role focusing on maintaining service stability and improving operational processes. The ideal candidate will have strong technical skills in Java, AWS, and Kubernetes, along with incident management experience. This contract position offers a dynamic environment with responsibilities including leading operational support and coordinating incident response efforts.

Qualifications

  • Technical skills in Java, AWS, and Kubernetes.
  • Current and active SC Clearance.
  • Proven track record in managing/supporting enterprise-scale services.
  • Experience in multi-vendor environments.

Responsibilities

  • Lead daily operational support of legacy systems.
  • Manage incidents, problems, and changes in line with ITIL standards.
  • Coordinate technical resources to resolve critical incidents.
  • Define and maintain operational documentation.

Skills

Java
AWS
Kubernetes
Incident Management
Analytical thinking

Job description
Overview

Security Cleared Site Reliability Engineer - Contract Outside IR35 - 3 months+ - Hybrid

We are seeking a Lead Operations/Site Reliability Engineer to take ownership of day-to-day operations across a legacy technology estate. The role will focus on maintaining service stability, ensuring operational readiness, and leading the response to incidents and outages. The Lead Operations/Site Reliability Engineer will play a pivotal role during the transition phase by embedding operational standards, improving monitoring and support processes, and enabling knowledge transfer into ongoing service delivery teams.

Key Responsibilities
  • Lead daily operational support of legacy systems, ensuring availability, performance, and resilience.
  • Manage incident, problem, and change activities in line with ITIL and enterprise service standards.
  • Proactively monitor and tune infrastructure, applications, messaging, and scheduling platforms.
  • Act as the escalation point for critical incidents, coordinating technical resources to achieve rapid resolution.
  • Lead root cause analysis and service improvement initiatives.
  • Define and maintain runbooks, standard operating procedures, and operational documentation.
  • Ensure backup, recovery, and disaster recovery processes are operationally tested and aligned to business needs.
  • Oversee job scheduling, batch management, and automation activities (e.g., Tivoli Scheduler).
  • Collaborate with Infrastructure, Development, and Architecture teams to support upgrades, migrations, and modernisation efforts.
  • Mentor operations engineers and manage knowledge transfer from discovery into business-as-usual operations.
Competencies
  • Technical background in Java, AWS and Kubernetes
  • Customer engagement management.
  • Strong leadership and coordination skills across technical and non-technical stakeholders.
  • Excellent analytical and diagnostic abilities, with a structured approach to discovery and documentation. Skilled in documenting processes, monitoring metrics, and reporting on operational health.
  • Excellent communication and documentation skills for effective knowledge capture and handover, including clear communication in high-pressure incident management situations.
  • Ability to work both in deep technical detail and at a higher-level architectural/system view.
  • Analytical and detail-oriented, with a continuous improvement mindset.
  • Incident management experience; resilient under pressure and effective at prioritising competing demands.
Experience
  • Ability to prioritise effectively in a complex, multi-system environment.
  • Technical skills in Java, AWS and Kubernetes
  • Current and active SC Clearance
  • Proven track record in managing/supporting enterprise-scale services.
  • Experience of working in a multi-vendor environment, where coordination, triage and joint working are essential to operational activities.
  • Familiarity with ITIL-aligned service management.
  • Experience building knowledge transfer (KT) libraries.
  • Previous involvement in system migrations, re-platforming, or legacy modernisation programmes highly desirable.
  • Background in high-availability, disaster recovery, and enterprise integration patterns.

Damia Group Limited acts as an employment agency for permanent recruitment and as an employment business for the supply of temporary workers. By applying for this job you accept our Data Protection Policy, which can be found on our website.

Please note that no terminology in this advert is intended to discriminate on the grounds of a person's gender, marital status, race, religion, colour, age, disability or sexual orientation. Every candidate will be assessed only in accordance with their merits, qualifications and ability to perform the duties of the job.

Damia Group is acting as an Employment Business in relation to this vacancy, in accordance with the Conduct Regulations 2003.