Machine Learning Infrastructure Engineer

ZipRecruiter

City Of London

On-site

GBP 100,000 - 130,000

Full time

Today

Job summary

A pioneering AI startup in London seeks an experienced candidate to own the ML infrastructure. The role involves designing scalable cloud architecture and managing GPU clusters. Ideal applicants will have a strong background in building ML systems from scratch and will work closely with researchers to optimize workflows. The salary range is £100k–£130k, flexible for strong profiles.

Qualifications

Experience in building ML infrastructure and cloud architecture from scratch.
Familiarity with designing and deploying scalable, high-performance cloud infrastructure.
Proficiency in setting up and optimizing containerized workflows.

Responsibilities

Design and deploy scalable cloud infrastructure for ML workloads.
Build and manage GPU clusters and distributed training environments.
Implement monitoring, incident response, and CI/CD practices.

Skills

Building ML infrastructure

Cloud architecture

Docker

Kubernetes

Terraform

Python

Tools

AWS

GCP

Azure

MLflow

Prometheus

Grafana

Overview

Do you want to own the ML infrastructure at a frontier AI startup?

Have you built cloud and ML systems from scratch, not just maintained them?

Are you ready to shape the backbone of 3D generative AI?

SpAItial is pioneering the development of a frontier 3D foundation model, combining cutting-edge AI, computer vision, and spatial computing to redefine how industries — from robotics and AR/VR to gaming and film — generate and interact with 3D content. Backed by £13m in seed funding, with half allocated to compute, SpAItial is a 10-person research-focused team moving fast towards a public demo later this year.

Responsibilities

Design and deploy scalable, high-performance cloud infrastructure for ML workloads
Build and manage GPU clusters, storage systems, and distributed training environments
Set up and optimise containerised workflows (Docker, Kubernetes, Terraform)
Implement robust monitoring, incident response, and CI/CD practices
Collaborate closely with researchers to integrate and scale experiments

Candidates must have experience building ML infrastructure and cloud architecture from scratch.

Key Details

Salary: £100k–£130k (flexible for strong profiles)
Working Model: On-site, London
Tech Stack: AWS/GCP/Azure, Kubernetes, Docker, Terraform, Python, MLflow/Prometheus/Grafana

If you want to shape the backbone of one of Europe’s most ambitious AI startups, we’d love to hear from you.
