ML/AI Engineer

Lloyds Banking Group

Manchester

Hybrid

GBP 70,929 - 85,000

Full time



Job summary

A leading banking institution is seeking an experienced ML/AI Engineer to join their Data & AI Engineering team in Manchester. You'll be responsible for building and maintaining scalable systems that support the machine learning lifecycle, focusing on Kubernetes orchestration and GPU optimization. This role offers a salary between £70,929 and £85,000 per annum, a diverse working environment, and a hybrid work model that emphasizes collaboration and innovation.

Benefits

Generous pension contribution up to 15%
Annual bonus award
Share schemes
Discounted shopping
30 days' holiday
Wellbeing initiatives

Qualifications

  • Strong Python skills for automation and tooling.
  • Hands-on experience with Kubernetes for model serving.
  • Expertise in CI/CD pipelines and GitOps workflows.

Responsibilities

  • Build and operate Kubernetes clusters for model inference.
  • Implement GitOps workflows and CI/CD pipelines.
  • Deploy and tune GPU-backed inference services.

Skills

Strong Python
Deep expertise in Kubernetes
CI/CD expertise
Practical experience with CUDA
Proficiency in Prometheus
Experience operating MLflow
Expert use of Git

Tools

Kubernetes
Docker
Helm
TensorRT
TorchServe

Job description

JOB TITLE: ML/AI Engineer

SALARY: £70,929 - £85,000 per annum

LOCATION: Manchester

HOURS: Full-time - 35 hours

WORKING PATTERN: Our work style is hybrid, which involves spending at least two days per week, or 40% of our time, at our Manchester office.

About this opportunity

Exciting opportunity for a hands-on ML/AI Engineer to join our Data & AI Engineering team. You'll build, automate, and maintain scalable systems that support the full machine learning lifecycle. You will lead Kubernetes orchestration, CI/CD automation (including Harness), GPU optimisation, and large-scale model deployment, owning the path from code commit to reliable, monitored production services.

This is a unique opportunity to shape the future of AI by embedding fairness, transparency, and accountability at the heart of innovation. You'll join us at an exciting time as we move into the next phase of our transformation. We're looking for curious, passionate engineers who thrive on innovation and want to make a real impact.

About us

We're on an exciting journey and there couldn't be a better time to join us. The investments we're making in our people, data, and technology are leading to innovative projects, fresh possibilities, and countless new ways for our people to work, learn, and thrive.

What you'll do
  • Design, build, and operate production-grade Kubernetes clusters for high-volume model inference and scheduled training jobs.
  • Configure autoscaling, resource quotas, GPU/CPU node pools, service mesh, Helm charts, and custom operators to meet reliability and efficiency targets.
  • Implement GitOps workflows for environment configuration and application releases.
  • Build CI/CD pipelines in Harness (or equivalent) to automate build, test, model packaging, and deployment across environments (dev/pre-prod/prod).
  • Enable progressive delivery (blue/green, canary) and rollback strategies, integrating quality gates, unit/integration tests, and model evaluation checks.
  • Standardise pipelines for continuous training (CT) and continuous monitoring (CM) to keep models fresh and safe in production.
  • Deploy and tune GPU-backed inference services (e.g., on A100s), optimise CUDA environments, and leverage TensorRT where appropriate.
  • Operate scalable serving frameworks (NVIDIA Triton, TorchServe) with attention to latency, efficiency, resilience, and cost.
  • Implement end-to-end observability for models and pipelines: drift, data quality, fairness signals, latency, GPU utilisation, error budgets, and SLOs/SLIs via Prometheus, Grafana, and Dynatrace.
  • Establish actionable alerting and runbooks for on-call operations; drive incident reviews and reliability improvements.
  • Operate a model registry (e.g., MLflow) with experiment tracking, versioning, lineage, and environment-specific artefacts.
  • Enforce audit readiness: model cards, reproducible builds, provenance, and controlled promotion between stages.
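
To give a flavour of the "progressive delivery with quality gates" work described above, here is a minimal, hypothetical sketch of a canary promotion check. All names, thresholds, and metrics are illustrative, not part of the role's actual tooling:

```python
# Hypothetical canary quality gate: decide whether a canary release should be
# promoted or rolled back, based on the kinds of signals this role monitors
# (tail latency, error rate, model accuracy relative to the baseline).
from dataclasses import dataclass


@dataclass
class CanaryMetrics:
    p99_latency_ms: float   # tail latency observed on the canary
    error_rate: float       # fraction of failed requests
    accuracy_delta: float   # canary accuracy minus baseline accuracy


def should_promote(m: CanaryMetrics,
                   max_p99_ms: float = 250.0,
                   max_error_rate: float = 0.01,
                   min_accuracy_delta: float = -0.005) -> bool:
    """Promote only if every quality gate passes; otherwise roll back."""
    return (m.p99_latency_ms <= max_p99_ms
            and m.error_rate <= max_error_rate
            and m.accuracy_delta >= min_accuracy_delta)


healthy = CanaryMetrics(p99_latency_ms=180.0, error_rate=0.002, accuracy_delta=0.001)
degraded = CanaryMetrics(p99_latency_ms=400.0, error_rate=0.03, accuracy_delta=-0.02)
print(should_promote(healthy))   # True -> promote
print(should_promote(degraded))  # False -> roll back
```

In practice such a gate would run inside the CI/CD pipeline, fed by live metrics, before traffic is shifted from the old release.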
What you'll need
  • Strong Python for automation, tooling, and service development.
  • Deep expertise in Kubernetes, Docker, Helm, operators, node-pool management, and autoscaling.
  • CI/CD expertise, with hands-on experience using Harness (or similar) to build multi-stage pipelines; experience with GitOps, artefact repositories, and environment promotion.
  • Practical experience with CUDA, TensorRT, Triton, TorchServe, and GPU scheduling/optimisation.
  • Proficiency in Prometheus, Grafana, and Dynatrace, defining SLIs/SLOs and alert thresholds for ML systems.
  • Experience operating MLflow (or equivalent) for experiment tracking, model bundling, and deployments.
  • Expert use of Git, including branching models, protected merges, and code-review workflows.
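
As a hedged illustration of the "defining SLIs/SLOs and alert thresholds" requirement, the sketch below shows one common way to express an error-budget burn rate. The SLO target and numbers are illustrative assumptions, not figures from this role:

```python
# Hypothetical error-budget burn-rate calculation for an availability SLO.
def burn_rate(errors: int, requests: int, slo_target: float = 0.999) -> float:
    """Rate at which the error budget is consumed; 1.0 means exactly on budget."""
    if requests == 0:
        return 0.0
    error_budget = 1.0 - slo_target           # allowed failure fraction
    return (errors / requests) / error_budget


# Example: at a 99.9% availability SLO, 20 failures in 1,000 requests burn the
# budget roughly 20x faster than allowed -- typically a page-worthy signal.
print(round(burn_rate(errors=20, requests=1000), 2))  # 20.0
```

Alerting stacks such as Prometheus usually evaluate this over multiple time windows so that short spikes and slow leaks both trigger at appropriate severities.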
It would be great if you had any of the following:
  • Experience with GCP (e.g., GKE, Cloud Run, Pub/Sub, BigQuery) and Vertex AI (Endpoints, Pipelines, Model Monitoring, Feature Store).
  • Experience building hooks for prompt/version management, offline/online evaluation, and human-in-the-loop workflows (e.g., RLHF) to enable continuous improvement.
  • Familiarity with Model Context Protocol (MCP) for tool interoperability, plus Google ADK and LangGraph/LangChain for agent orchestration and multi-agent patterns.
  • Experience with Ray, Kubeflow, or similar frameworks.
  • Experience embedding controls, audit evidence, and governance in regulated environments.
  • Experience with GPU efficiency, autoscaling strategies, and workload right-sizing.
About working for us

Our focus is to ensure we're inclusive every day, building an organisation that reflects modern society and celebrates diversity in all its forms. We want our people to feel that they belong and can be their best, regardless of background, identity, or culture, which is why we especially welcome applications from under-represented groups. We're Disability Confident, so if you'd like reasonable adjustments to be made to our recruitment processes, just let us know.

We also offer a wide-ranging benefits package, which includes:
  • A generous pension contribution of up to 15%
  • An annual bonus award, subject to Group performance
  • Share schemes including free shares
  • Benefits you can adapt to your lifestyle, such as discounted shopping
  • 30 days' holiday, with bank holidays on top
  • A range of wellbeing initiatives and generous parental leave policies

Want to do amazing work, that's interesting and makes a difference to millions of people? Join our journey!
