Senior Site Reliability Engineer

UnlikelyAI

City Of London

Hybrid

GBP 75,000 - 90,000

Full time

Today

Job summary

A cutting-edge AI company in the UK is seeking a Senior Site Reliability Engineer to enhance system reliability and scalability. This role involves designing fault-tolerant infrastructure on AWS, optimizing CI/CD pipelines, and working closely with a multidisciplinary team. Candidates should have a strong background in backend development, AWS, and Python, alongside a collaborative mindset. The position offers a hybrid work model with team lunches and various social activities.

Benefits

Free team lunches
Hybrid working arrangement
Optional social activities
Annual offsite

Skills

Production systems experience
Backend development
AWS expertise
Fluency in Python
Security architecture
Collaborative problem-solving
Continuous improvement mindset

Tools

GitHub Actions

Job description

This job is brought to you by Jobs/Redefined, the UK's leading age-inclusive jobs board for the over-50s.

At UnlikelyAI, we are building the future of AI: one that is reliable, accurate and transparent. Our neurosymbolic technology harnesses the power of LLMs and generative AI, and combines it with classical symbolic technology to produce hallucination-resistant artificial intelligence for high-trust applications.

The Role

We're looking for an experienced Senior Site Reliability Engineer to help us scale our systems as we move from prototypes into full production. This is a strategically important role, where you'll define and own our approach to reliability, scalability, and security.

You'll design and operate infrastructure that is fault-tolerant, automated, and observable - ensuring our technology runs smoothly in real-world, high-trust environments. Your work will span both internal and customer-facing systems, applying your engineering skills to solve complex reliability and scalability challenges.

You'll collaborate with a multidisciplinary team of engineers and scientists in a highly cross-functional environment. Our projects move quickly from idea to delivery, and our teams adapt rapidly to changing priorities - so you'll thrive in a dynamic setting where reliability and scalability need to keep pace with innovation.

This is a hands-on position with high impact, giving you the opportunity to shape both the way our systems are built and the standards that guide how we deliver cutting-edge AI into production.

What You'll Do

In this role, you'll be responsible for building the reliability foundations that make our technology production-ready:

  • Architect, operate, and scale production systems on AWS.
  • Implement and optimise CI/CD pipelines (GitHub Actions).
  • Build automation and tooling to streamline reliability, performance, and security.
  • Design fault-tolerant systems with strong monitoring and observability.
  • Partner with engineers and scientists to embed reliability into projects from day one.
  • Champion developer-driven quality and "shift-left" reliability practices.
  • Apply sound judgement and pragmatism in scaling and system design decisions.

What We're Looking For

We're looking for someone who combines technical depth with a practical, collaborative mindset:

  • Proven experience running production systems at scale.
  • Strong backend development and AWS expertise.
  • Fluency in Python and solid CS fundamentals.
  • Track record of architecting secure, reliable systems.
  • Passion for developer-driven quality and continuous improvement.
  • Pragmatic, collaborative problem-solver with sound judgement.
  • Bias for action and comfort navigating ambiguity.

What's in it for You

As a Senior Site Reliability Engineer at UnlikelyAI, you'll play a critical role in bringing our technology to life:

  • Influence reliability standards at a company scaling cutting-edge AI.
  • Work on advanced reasoning capabilities in a dynamic, cross-functional environment.
  • Join a team that values speed, impact, and collaboration.

Working at UnlikelyAI

We offer a range of benefits designed to support our team's wellbeing and work-life balance.

  • We have a hybrid working arrangement that flexibly balances working from home with office-based working.
  • Our office is located in Bloomsbury, approximately a five-minute walk from both Tottenham Court Road and Holborn stations.
  • We provide free team lunches every Tuesday, Wednesday and Thursday.
  • We schedule a variety of optional social and extra-curricular activities.
  • We have an annual offsite, usually to an international location, where we can work and socialise in the sun.

Equal Opportunities

We are committed to having a truly diverse team where everyone is encouraged to be their authentic selves. We do not discriminate based on gender, race, religion, sexual orientation, national origin, political affiliation, disability, age, marital status, medical history, parental status or genetic information.
