Site Reliability Engineer

Stratospherec Ltd

United Kingdom

Remote

GBP 60,000 - 80,000

Full time

2 days ago

Job summary

A leading global SaaS company is seeking a Senior Site Reliability Engineer to join their UK Cloud Infrastructure team, offering a competitive salary and the chance to work remotely. In this role, you'll ensure the reliability of SaaS products and enhance system performance through software engineering principles. As part of a high-performance team, you will lead projects aimed at continuous improvement in a supportive and innovative environment.

Benefits

Excellent pension scheme
Life insurance
Generous holiday allowance
Supportive learning environment

Qualifications

  • Experience in SRE/DevOps/Platform Engineering with .NET and C#.
  • Proficiency in scripting languages such as Bash, Python, or PowerShell.
  • Experience with monitoring and logging tools like DataDog or Prometheus.

Responsibilities

  • Lead technical initiatives including mentoring and solution design.
  • Implement strategies to improve platform reliability and incident response.
  • Develop and maintain automations and Infrastructure as Code.

Skills

C#
Cloud Computing
DevOps Practices
Automation
Containerization

Tools

Kubernetes
Docker
Terraform
AWS
Azure

Job description

Site Reliability Engineer

Fully remote work for UK citizens based in the UK – Salary to £80k + Benefits

We are looking for a Site Reliability Engineer/DevOps Engineer with a background in .NET software development and strong C# skills. The candidate should also have knowledge of DevOps tools like Kubernetes and/or Docker, and experience with Azure or AWS cloud platforms. This role involves supporting a global SaaS platform within a growing cloud infrastructure team.

Our client, a global digital SaaS software company, offers a fully remote opportunity for an experienced Senior Site Reliability Engineer/DevOps Engineer to join their UK Cloud Infrastructure team.

Senior SREs are responsible for ensuring the reliability of SaaS products, applying principles of software and systems engineering to improve system reliability while minimizing manual work. They should be experienced in software engineering, operational discipline, and automation.

The team works remotely across the US and Australia, supporting a market-leading student community management SaaS platform used by millions of university students worldwide.

In this role, you will use your software engineering experience to enhance system performance and reliability, develop internal systems, and automate manual processes. You will join the Platforms team, working in a "follow the sun" model across multiple regions.

Role Responsibilities:
  1. Provide technical leadership and mentorship through knowledge sharing, pair programming, code reviews, and solution design.
  2. Identify and implement technical solutions to improve platform reliability, including mitigation strategies and operational playbooks.
  3. Implement and maintain monitoring, alerting, and logging systems to identify and respond to incidents.
  4. Ensure scalability and efficiency of cloud infrastructure and systems to handle traffic and data growth.
  5. Conduct performance testing to identify and resolve bottlenecks.
  6. Develop and maintain platform solutions, and automate infrastructure provisioning, configuration, and management tasks using Infrastructure as Code.
  7. Monitor, review, and tune databases to ensure high availability and performance.
  8. Collaborate with product engineering teams to design and build observable software.

Required Skills and Experience:
  • Proven experience in an SRE/DevOps/Platform Engineering role, with a background in software engineering using .NET and C#.
  • Proficiency in C# and scripting languages like Bash, Python, or PowerShell.
  • Experience with containerization and orchestration technologies such as Docker and Kubernetes.
  • Proficiency with cloud providers like Azure, AWS, or GCP.
  • Experience with Infrastructure as Code tools such as Terraform (preferred), Ansible, or CloudFormation.
  • Experience with monitoring and logging tools like DataDog, Prometheus, Grafana, or similar.
  • A track record of maintaining highly available and performant production environments.
  • Ability to develop effective mitigation strategies and operational playbooks.

Useful/Bonus Skills:
  • Experience with CI/CD tools like Azure DevOps, GitHub Actions, or Octopus Deploy.
  • Relevant certifications, such as Microsoft Certified: Azure Solutions Architect Expert or Certified Kubernetes Administrator (CKA).
  • Experience in database management and performance tuning, especially MSSQL.

Employee Benefits:
  • Opportunity to be part of a well-established, high-performance SaaS company with over 30 years of history.
  • Excellent pension scheme and life insurance.
  • Generous holiday allowance.
  • Supportive environment emphasizing learning and development.
  • Work with a passionate, high-performing team committed to innovation and continuous improvement.

This role is part of a large program of change and improvement for a market-leading global SaaS company. If you're seeking an interesting SRE role within a forward-thinking organization, this could be a tremendous career opportunity. Please apply with your CV to find out more.
