Senior AI Engineer

Veridox

Remote

GBP 100,000 - 125,000

Full time

14 days ago

Job summary

An AI-focused tech company in the UK is seeking a performance-driven Senior AI Engineer to lead the development of LLM and RAG pipelines. The role involves building efficient AI features and requires a strong background in statistical evaluation. Candidates should demonstrate proficiency in Python and prior experience with production systems. This is a remote position for professionals who want to apply AI to fraud detection.

Qualifications

Proven experience in LLM and RAG engineering in production is essential.
Strong understanding of statistical evaluation and evaluation metrics.
Excellent communicator with the ability to troubleshoot and resolve issues.

Responsibilities

Lead the development and optimisation of LLM and RAG pipelines.
Curate and own the Golden Dataset for model evaluation.
Automate evaluation processes and track key performance indicators.

Skills

Building LLM/RAG pipelines

Statistical evaluation

Communication skills

Understanding of unit economics

Experience with Python

Tools

AWS

OpenSearch

CI/CD pipelines

Role

Senior AI Engineer (LLMOps & RAG)

Location

Remote (UK-based preferred)

Type

Full-time

Compensation

Competitive

About Veridox

Veridox is an AI-driven fraud detection platform purpose-built for insurers. We combine document analysis with contextual intelligence to produce detailed risk analysis, with a strong focus on trust, accuracy and explainability. As part of our growing team, you’ll play a key role in scaling the technical vision that powers our platform.

The Role

We’re looking for a hands‑on, delivery‑first engineer to lead the development and optimisation of our LLM and RAG pipelines. This isn’t a research role. You’ll be responsible for building, benchmarking, and deploying high‑performance, cost‑efficient AI features that work, and improve, in production.

We’re not looking for 100-page white papers. We’re looking for someone who can ship features, track performance, and find novel solutions to customers’ problems.

What You’ll Do

Build and optimise RAG pipelines using AWS Bedrock, OpenSearch, and vector stores
Own our “Golden Dataset”, curating the truth‑set we use to evaluate model output
Automate evaluation using tools like RAGAS, DeepEval, or custom “LLM‑as‑a‑judge” logic
Track drift, hallucination, and cost using observability tooling (Arize, Phoenix, etc.)
Design self‑improving systems where user interaction data flows back into future retrieval / ranking
Balance cost and performance by selecting the right model for the right task (Claude, SLMs, or whatever gets the job done)
Write clean and fast Python and ship infrastructure as code

Tech Stack

If your experience covers a mix of the platforms and technologies below, we’d like to hear from you.

Languages: Python, TypeScript, HCL
Vector & Search: OpenSearch, AWS S3 Vectors
Observability & Evaluation: Arize, Phoenix, RAGAS, DeepEval
Infrastructure: AWS Step Functions, Azure Function Apps
DevOps: CI/CD pipelines (Bitbucket)

What We’re Looking For

Proven experience building LLM / RAG pipelines in production
Confidence in statistical evaluation (sample sizes, regression testing)
Ability to define evaluation metrics and continuously improve model outputs
Strong understanding of unit economics in LLM systems (token cost, latency, accuracy trade‑offs)
Clear communicator who can flag blockers early and ship fast

Nice‑to‑Have

Experience with AWS S3 vector store or similar
Familiarity with AI‑driven fraud detection, legal tech, or investigative tools
Prior work with small language models (7B–8B) for cost‑effective inference

Why Join Us?

You will work on a system where evaluation is central to the product. You’ll have the autonomy to define standards for building, measuring, and improving complex AI systems.

If you care about rigour, impact, and building things that matter, we’d love to hear from you.
