Site Reliability Engineer (UK)

WALT Labs

City Of London

On-site

GBP 50,000 - 70,000

Full time

18 days ago

Job summary

A leading managed service provider in the UK is seeking a Site Reliability Engineer to maintain cloud infrastructure, with a focus on Google Cloud Platform (GCP). You will manage incidents, support clients, and ensure operational efficiency. The role requires strong problem-solving skills and experience with GCP and related tools. The position offers private medical insurance and professional development opportunities.

Benefits

Private Medical Insurance
Paid Time Off that increases with longevity
Professional development and advancement opportunities
Pension
Growth opportunities

Qualifications

  • Proven experience with Google Cloud Platform (GCP) services - 3+ years.
  • Strong troubleshooting and problem-solving skills in cloud environments.
  • Certifications in GCP (e.g., Google Cloud Associate or Professional certifications) are highly desirable.

Responsibilities

  • Provide technical support and resolve issues related to GCP services.
  • Manage and respond to cloud incidents using incident.io, ensuring timely resolution.
  • Monitor and maintain cloud infrastructure for performance, reliability, and security.

Skills

Google Cloud Platform (GCP) services
Kubernetes
incident.io
JIRA
Grafana
DataDog
cloud security best practices
Python
Terraform
communication skills

Education

Bachelor’s degree in Computer Science, Information Technology, or related field

Tools

incident.io
JIRA
Google Workspace

Job description

WALT Labs, a leading managed service provider, is dedicated to empowering businesses by harnessing the power of cloud technology. Our team specializes in delivering customized solutions tailored to meet the unique needs of our clients, driving growth and operational efficiency across industries. From supporting small businesses with seamless data migration to enabling large corporations to manage complex infrastructure projects, we provide exceptional service while staying at the forefront of cloud technology advancements.

We are seeking a skilled Site Reliability Engineer - UK with a strong focus on Google Cloud Platform (GCP) to join our dynamic team. In this role, you’ll be responsible for maintaining cloud infrastructure, managing incidents, and ensuring seamless operations for our clients. You’ll use tools like incident.io and JIRA to manage and resolve support requests efficiently.

This is an in-office role: Monday - Friday, 9 AM - 6 PM GMT / BST

Qualifications for Site Reliability Engineer:
  • Proven experience with Google Cloud Platform (GCP) services - 3+ years. (Kubernetes a must!)
  • Understanding of Google Workspace (admin experience a plus)
  • Familiarity with incident.io (or equivalent) for incident tracking and management.
  • Proficiency in using JIRA for task management and support workflows.
  • Strong experience working with observability tools (Grafana and DataDog)
  • Strong troubleshooting and problem-solving skills in cloud environments.
  • Understanding of cloud security and performance optimization best practices.
  • Knowledge of scripting or automation tools (e.g., Python, Terraform) is a plus.
  • Excellent written communication and customer service skills.
  • Certifications in GCP (e.g., Google Cloud Associate or Professional certifications) are highly desirable.
  • Ability to work under pressure and prioritize tasks effectively.
  • Bachelor’s degree in Computer Science, Information Technology, or related field (or equivalent experience).

Role responsibilities
  • Provide technical support and resolve issues related to Google Cloud Platform (GCP) services and AWS.
  • Provide client support for Google Workspace.
  • Manage and respond to cloud incidents using incident.io, ensuring timely resolution.
  • Use JIRA to log, track, and prioritize support tickets and workflow tasks.
  • Monitor and maintain cloud infrastructure for performance, reliability, and security.
  • Collaborate with teams to identify and implement solutions to technical challenges.
  • Assist in deploying, configuring, and optimizing GCP resources.
  • Create and maintain documentation for troubleshooting processes and best practices.
  • Proactively identify opportunities to improve cloud environments and support processes.
  • Support clients and stakeholders by providing clear communication and updates during incident resolution.
  • Stay up-to-date with the latest GCP developments and contribute to team knowledge sharing.

Benefits
  • Private Medical Insurance
  • Paid Time Off that increases with longevity (additional 1.5 days every 2 years)
  • Professional development and advancement opportunities
  • Pension
  • Growth opportunities