Site Reliability Engineer

Capital on Tap

City Of London

Hybrid

GBP 60,000 - 80,000

Full time

Job summary

A dynamic fintech company in London is seeking a Site Reliability Engineer to ensure the reliability and performance of their platforms. You will design, build, and monitor systems, improving uptime and collaborating cross-functionally. The ideal candidate has a strong background in Azure, Kubernetes, and CI/CD tools, along with excellent communication skills. This hybrid role allows for work from home while connecting with a vibrant office culture.

Benefits

Private Healthcare
Annual Learning and Wellbeing Budget
28 days holiday

Qualifications

  • Experience managing a public cloud, preferably Azure.
  • Proficient in Infrastructure as Code (IaC) with Terraform and Pulumi.
  • Excellent collaboration skills in an agile environment are essential.

Responsibilities

  • Ensure reliability and availability of platforms.
  • Design and build monitoring systems for uptime.
  • Collaborate with Platform teams for reliable application solutions.

Skills

Azure management
Kubernetes
Linux systems
Scripting languages
Communication skills

Tools

Terraform
Datadog
Docker
CI/CD tools

Job description

Overview

We’re Capital on Tap. We were founded with the mission to help small business owners and make their lives easier. We provide an all-in-one business credit card & spend management platform that helps business owners save time and money. We proudly serve over 200,000 businesses worldwide, with the goal of helping 1 million small businesses by 2030.

Why Join Us?

We empower you to be innovative and solve complex problems. Take ownership, make an impact, and thrive in our scaling and agile environment. This is a hybrid role; the SRE team works from our London (Shoreditch) offices 1-2 days per week.

SRE at Capital on Tap

SRE at Capital on Tap uses a hybrid embedded SRE model. We work closely with teams across Capital on Tap to provide the best support. Our primary objective is to gain visibility into our platform’s health while offering scalable solutions.

What You’ll Be Doing

As a Site Reliability Engineer (SRE), you will ensure the reliability, performance, and availability of our platforms. You will design, build, and monitor systems to maximise uptime and efficiency, and collaborate with Platform teams to build reliable, scalable applications. You will also proactively address outages and performance issues with structured monitoring and alerting, and help determine when features can launch by defining the reliability they require through SLAs, SLIs, and SLOs.

  • Manage and automate Azure, Datadog, NGINX & Cloudflare
  • Develop and monitor Kubernetes and Serverless resources
  • Maintain infrastructure code with Terraform & CRDs / Crossplane
  • Improve systems and processes; consult stakeholders to enhance platform performance
  • Participate in new application architecture & design processes
  • Design solutions to reduce toil
  • Create SLIs and SLOs; increase application visibility
  • Align with the Product team on SLAs and core service objectives
  • Collaborate with Platform Engineers for automated solutions and pipelines
  • Enhance user experience with infrastructure and pipeline optimisation
  • Support CI/CD tools such as Azure DevOps, Octopus Deploy and Flux
  • Lead incident troubleshooting to safeguard customer experience

Qualifications

  • Experience in managing a public cloud (Azure advantageous)
  • Experience in Azure DevOps, Octopus, Flux or other CI/CD tools
  • Experience in Go (preferred), PowerShell (preferred), Python, C# or other scripting languages
  • Experience with Linux and Microsoft Systems
  • Excellent communication skills and ability to collaborate with multiple teams in an agile environment
  • Proficient in IaC technologies with Terraform and Pulumi
  • Experience with cloud monitoring solutions (Datadog advantageous)
  • Experience with Kubernetes and Docker
  • Experience with Chaos Engineering practices
  • Experience with IDPs and software cataloguing
  • Experience with observability and tracing best practices

Diversity & Inclusion

We welcome, consider and encourage applications from anyone who shares our commitment to inclusivity. Join us in creating a space where authenticity thrives and everyone can do their best work.

Great Work Deserves Great Perks

We foster a fun office culture with amenities such as a pool table, arcade machine, beer tap, and office dogs. Check out our benefits:

  • Private Healthcare including dental and opticians through Vitality
  • Worldwide travel insurance through Vitality
  • Anniversary Rewards (£250, £500, £750, 4-week fully paid sabbatical)
  • Salary Sacrifice Pension Scheme up to 7% match
  • 28 days holiday (plus bank holidays)
  • Annual Learning and Wellbeing Budget
  • Enhanced Parental Leave
  • Cycle to Work Scheme
  • Season Ticket Loan
  • 6 free therapy sessions per year
  • Dog Friendly Offices
  • Free drinks and snacks in our offices

Interview Process

  • First stage: 30-minute intro and values call with Talent Partner (video call)
  • Second stage: 45-minute CV overview with Head of Department & Engineering Team Leads and/or PM (video call)
  • Final stage: 60-minute questions and scenario-based interview with SRE Team Lead (video call)

Apply

Excited to work here? Apply via this posting. We aim to respond within 3 working days (up to 5 during busy periods).

Check out more about our benefits, values and mission on the career page.

Application Details

First Name, Last Name, Email, Phone, Country, Resume/CV, LinkedIn, How you heard about this job, Salary expectations, Notice period, Visa sponsorship, Equality data (voluntary), and consent to data handling will be collected as part of the application process. All questions are optional and will not affect your application if you choose not to answer.

UK DE&I Data

We are committed to equality and diversity. Data collected here is anonymous and optional and will not affect your application.
