Senior Site Reliability Engineer

Department for Work and Pensions (DWP)

Birmingham

On-site

GBP 60,000 - 80,000

Full time

Job summary

A governmental organization in Birmingham seeks a Senior Site Reliability Engineer to ensure application reliability and performance. Your role will include driving SRE best practices, collaborating with development teams, and managing incidents effectively. This position offers a unique opportunity to be part of a significant digital transformation, fostering innovation and engineering ownership. The ideal candidate will demonstrate strong technical and soft skills, playing a vital role within the team.

Benefits

Disability Confident employer status

Qualifications

  • Proven experience as a Senior Site Reliability Engineer or similar role.
  • Ability to manage incident response and resolution.
  • Strong soft skills to collaborate with cross-functional teams.

Responsibilities

  • Design and develop techniques to improve application reliability.
  • Collaborate with development teams to ensure best practices.
  • Manage error budgets in alignment with product owners.
  • Coach and mentor app development and operations engineers.
  • Provide on-call support and reduce toil through automation.

Skills

SRE best practices
Cloud technologies
Collaboration with development teams
Technical direction
Incident management

Job description

This is a fantastic opportunity to join DWP and be part of one of the biggest digital transformations in Europe, implementing leading-edge technologies with the user at the centre of everything we do. We create innovative digital solutions that make a difference to the lives of our 22 million users.

We are looking for Senior Site Reliability Engineers (SRE) to join one of our SRE teams at the heart of Digital Transformation. As a Senior Site Reliability Engineer, you will drive adoption of SRE best practice across our cloud estate. Using both your soft skills and technical experience, you will work with teams to ensure our standards and governance are met when onboarding services into the cloud through a dedicated assessment stage-gate process. This in turn ensures that our citizen-facing applications satisfy all the operational and security requirements for running in production.

Please note this role requires you to pass Security Check clearance. For further information, please see 'Selection process details'.

You will play a pivotal role in ensuring the reliability and performance of our applications and infrastructure. The SRE team will put you in a position to work with application teams across the department on developing reliable and secure solutions for citizens across the UK. You will lead by example, providing technical direction and supporting other SREs within your team. You will work with development teams from the design phase to help them follow good practice and department standards when building their application infrastructure.

Additionally, responsibilities of the role will include:

  • Design and develop the techniques for improving application reliability, run books, knowledge transfer across teams, and ongoing SRE strategy within your Functional and Professional Communities.
  • Work collaboratively with development teams and provide guidance around best practice and ensure monitoring of applications is enabled.
  • Push a mindset change within the organisation to foster engineering ownership, SRE best practice and the importance of the integrity and maintenance of the Live Service.
  • Manage the error budget agreed with the product owner for the application and ensure that work is balanced in alignment with it.
  • Act as the focal point for the investigation and resolution of major or complex incidents for the service, ensuring people with the right skills and expertise are proactively available to respond effectively.
  • Assess the impact of change requests in consultation with stakeholders, providing technical expertise and authorising the implementation of subsequent changes.
  • Coach and mentor application development and operations engineers in the practice and techniques of SRE.
  • Conduct reviews of all high-priority and major incidents, ensuring they are completed quickly and published.
  • Routinely seek views and capture ideas from stakeholders and team members for improvements and encourage collaboration and innovation.
  • Provide on-call support to help restore services, through dedicated run books or technical experience.
  • Help to reduce toil and increase automation, developing reliability improvements that reduce time to live and the cost of repetitive tasks.

Disability Confident

About Disability Confident

A Disability Confident employer will generally offer an interview to any applicant that declares they have a disability and meets the minimum criteria for the job as defined by the employer. It is important to note that in certain recruitment situations such as high-volume, seasonal and high-peak times, the employer may wish to limit the overall numbers of interviews offered to both disabled people and non-disabled people. For more details please go to .