Enable job alerts via email!

Platform Engineer

CATCHES

Remote

GBP 80,000 - 100,000

Full time

Yesterday

Be an early applicant

Generate a tailored resume in minutes

Land an interview and earn more. Learn more

Job summary

A luxury fashion tech company is seeking a Platform Engineer to manage and optimise next-generation high-performance computing infrastructure. Responsibilities include automating GPU cluster provisioning, maintaining Linux environments for AI/ML workloads, and optimizing for containerized workloads. This fully remote position offers a high-trust environment that values innovation and creativity in tech and fashion. Ideal candidates will have a strong background in Linux Systems Administration and experience with IaC tools like Terraform and Ansible.

Benefits

Co-working allowances

Remote-first environment

High-trust culture

Qualifications

Strong background in Linux Systems Administration.
Experience managing Bare Metal servers (on-premise or packet/equinix metal).
Proficiency in Infrastructure as Code (IaC) tools.

Responsibilities

Automate provisioning and lifecycle of GPU clusters using Terraform and Ansible.
Maintain stability of large-scale Linux environments supporting AI/ML workloads.
Collaborate to troubleshoot hardware and networking issues.

Skills

Linux Systems Administration

Infrastructure as Code (IaC) tools

High-throughput networking

Tools

Terraform

Ansible

Prometheus

Grafana

Kubernetes

Docker

About

Backed by some of the most influential names in luxury fashion globally. We blend advanced 3D rendering, AI and VFX techniques to deliver unparalleled shopping experiences for luxury fashion.

Role

We are hiring a Platform Engineer to manage and optimise our next-generation high-performance computing infrastructure. Move beyond standard cloud instances and manage the raw power of bare metal GPU clusters.

Responsibilities

Automate the provisioning and lifecycle of high-performance GPU clusters using Terraform and Ansible.
Maintain the stability and performance of large-scale Linux environments supporting AI / ML training workloads.
Collaborate with vendors and internal teams to troubleshoot hardware and networking bottlenecks (latency, throughput).
Implement monitoring solutions (Prometheus / Grafana) to visualise GPU health and cluster efficiency.
Assist in optimising the stack for containerised workloads (Kubernetes / Docker).

Requirements

Strong background in Linux Systems Administration.
Experience managing Bare Metal servers (on-premise or packet / equinix metal).
Proficiency in Infrastructure as Code (IaC) tools.
Nice to have : Exposure to GPUs, InfiniBand, or high-throughput networking (we will train the right candidate).

What working with CATCHES is like

Fully remote-first, async-friendly, with optional co-working allowances.
High-trust, low-bureaucracy environment that values experimentation and shipping.
Early influence on product, architecture and engineering culture.
Cutting-edge tech, luxury-fashion creativity, and games-industry scale challenges combined.

Get your free, confidential resume review.

or drag and drop a PDF, DOC, DOCX, ODT, or PAGES file up to 5MB.

Top cities

Top companies

Popular jobs