Lead Site Reliability Engineer

TN United Kingdom

Glasgow

On-site

GBP 70,000 - 100,000

Full time

22 days ago

Job summary

Join a globally recognized firm as a Lead Site Reliability Engineer, where you will play a critical role in enhancing the reliability of applications and platforms. You will lead a team, mentor engineers, and drive initiatives that leverage data-driven analytics to improve service levels. This position offers the chance to work with advanced technologies and collaborate across various teams to ensure the highest standards of operational excellence.

Qualifications

  • Fluency in Python, Java (Spring Boot), or Unix shell.
  • Experience with CI/CD tools and containerization.
  • Deep knowledge of technical processes and software applications.

Responsibilities

  • Lead initiatives to improve reliability and stability of applications.
  • Serve as the main contact during major incidents.
  • Document and share knowledge within the organization.

Skills

Reliability
Scalability
Performance
Security
Programming
Observability
Networking
Problem Solving

Education

Formal training in reliability and enterprise architecture

Tools

Grafana
Dynatrace
Docker
Kubernetes
Jenkins

Job description

Take on a critical role in shaping the future of a globally recognized firm, with direct and significant impact in a field built for top achievers in site reliability.

As a Lead Site Reliability Engineer at JPMorgan Chase within the Risk Technology Team, you hold a leadership role in your team, demonstrate strong knowledge across multiple technical domains, and advise others on technical and business issues. You will lead resiliency design reviews, break down complex problems into manageable tasks for engineers, act as a technical lead for medium to large-sized products, and provide mentorship to other engineers.

Job responsibilities

  1. Demonstrate and champion site reliability culture and practices, exerting technical influence across your team.
  2. Lead initiatives to improve the reliability and stability of applications and platforms, utilizing data-driven analytics to enhance service levels.
  3. Collaborate with team members to define service level indicators, establish service level objectives, and set error budgets with stakeholders.
  4. Maintain high technical expertise in one or more domains, proactively identifying and resolving technology bottlenecks.
  5. Serve as the main contact during major incidents, quickly identifying and resolving issues to prevent financial losses.
  6. Document and share knowledge within the organization through internal forums and communities of practice.

Required qualifications, capabilities, and skills

  1. Formal training or certification in reliability, scalability, performance, security, enterprise architecture, and toil reduction, along with advanced practical experience.
  2. Fluency in at least one programming language or framework, such as Python, Java (Spring Boot), or Unix shell.
  3. Deep knowledge of software applications and technical processes, with emerging expertise in one or more technical disciplines.
  4. Proficiency in observability tools such as Grafana, Geneos, Dynatrace, Prometheus, Datadog, and Splunk, including monitoring, SLO alerting, and telemetry collection.
  5. Experience with CI/CD and infrastructure-as-code tools such as Jenkins, GitLab, and Terraform.
  6. Experience with containerization and orchestration tools such as Docker, Kubernetes, and ECS.
  7. Experience troubleshooting networking technologies and issues.
  8. Ability to analyze and solve problems involving complex data structures and algorithms.
  9. Self-motivated to learn and evaluate new technologies, with the ability to teach programming languages to team members.
  10. Ability to collaborate across different stakeholder levels and groups.
  11. Working knowledge of Apache, Tomcat, and TomEE.
