Site Reliability Engineer Senior Lead

Mars

Slough

On-site

GBP 125,000 - 150,000

Full time

9 days ago

Job summary

A multinational organization in Slough seeks a Senior Lead for Systems Reliability Engineering. In this pivotal role, you will ensure the reliability, performance, and scalability of critical systems, overseeing automation and enhancing observability. The ideal candidate has over 7 years in IT and strong leadership skills, particularly within SRE and DevOps practices. This position offers competitive pay and benefits.

Benefits

Best-in-class learning and development support

Industry competitive salary and benefits package

Company bonus

Qualifications

7+ years of experience in IT departments or a relevant field.
3+ years in a leadership, SRE, DevOps, or systems engineering role.
Experience in issue and problem management in a multicultural environment.

Responsibilities

Design, implement, and maintain highly available and scalable systems.
Monitor system performance, reliability, and security.
Develop and maintain automated CI/CD pipelines.

Skills

Site Reliability Engineering principles

DevOps best practices

Analytical skills

Interpersonal skills

Organizational skills

Proficiency with cloud platforms

Configuration management

Scripting

Monitoring and observability tools

Communication skills

Education

Bachelor’s degree in Information Technology, Computer Science, Business Management, or related field

Tools

Terraform

Ansible

Job Description:

The Systems Reliability Engineering (SRE) Senior Lead is a pivotal leader within our organization, responsible for ensuring the reliability, performance, and scalability of our critical systems. This role is instrumental in strategizing and overseeing reliability with an end-to-end service delivery perspective, aligning technical infrastructure with business objectives to meet evolving customer needs. As an influential figure in our company, the Systems Reliability Engineering Senior Lead will spearhead initiatives to automate infrastructure, enhance system observability, and drive the transformation of our IT operations.

What are we looking for?

Bachelor’s degree in Information Technology, Computer Science, Business Management, or a related field
7+ years of experience in IT departments or a relevant field
3+ years in a leadership, SRE, DevOps, or systems engineering role
A seasoned professional with a deep understanding of Site Reliability Engineering (SRE) principles, DevOps best practices, and cutting‑edge technologies.
Strong analytical, interpersonal, and organizational skills with a proven track record in issue and problem management in a multicultural and global environment.
Proficiency with cloud platforms and experience in configuration management, scripting, and monitoring and observability tools.
Understanding of business processes, change management, and ITSM processes, including service level management and reporting.
Excellent communication skills and the ability to work collaboratively with cross‑functional teams.

What will be your key responsibilities?

The Systems Reliability Engineering Senior Lead ensures that the technology stack being deployed can be supported in line with business requirements. The focus is on the infrastructure technology stack and the IT Operations support model:

System Reliability, Performance, and Best Practices:

Design, implement, and maintain highly available and scalable systems.
Monitor system performance, reliability, and security using advanced monitoring and logging tools.
Proactively identify and resolve issues that could impact service availability.
Conduct assessments to ensure systems comply with market standards and best practices.

Automation and Infrastructure as Code (IaC):

Develop and maintain automated CI/CD pipelines to streamline deployments.
Implement Infrastructure as Code (IaC) using tools such as Terraform, Ansible, or others.
Automate repetitive tasks to increase system efficiency and reliability.

Collaboration and DevOps Culture:

Collaborate with software development teams to ensure new features are built with reliability in mind.
Advocate for best practices in software engineering, deployment, and operations, and foster a culture of collaboration and continuous improvement across teams.

Capacity Planning and Scaling:

Conduct capacity planning to anticipate future growth and scaling needs.
Implement strategies to efficiently scale systems based on demand.

What can you expect from Mars?

Work with over 140,000 diverse and talented Associates, all guided by the Five Principles.

Join a purpose-driven company, where we’re striving to build the world we want tomorrow, today.

Best-in-class learning and development support from day one, including access to our in‑house Mars University.

An industry competitive salary and benefits package, including company bonus.

Mars is an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability status, protected veteran status, or any other characteristic protected by law. If you need assistance or an accommodation during the application process because of a disability, it is available upon request. The company is pleased to provide such assistance, and no applicant will be penalized as a result of such a request.
