SC Cleared Data Scientist/ML Engineer - Generative AI, Python, LLM

fortice

Remote

GBP 50,000 - 70,000

Full time

Yesterday

Job summary

A UK technology consultancy is seeking an experienced Contract Data Scientist/ML Engineer for a remote role on a major Generative AI project. The successful candidate will analyse and transform complex datasets, design RAG datasets and retrieval strategies, and evaluate LLM and RAG system performance. Strong Python skills and experience in LLM data preparation are essential, as is the ability to communicate technical findings clearly. Active SC clearance is required, and the role is remote with periodic UK travel.

Qualifications

  • Strong Python engineering skills for exploratory and production-ready tasks.
  • Experience with data analysis and EDA techniques.
  • Expertise in preparing data for LLMs, including prompt engineering and embeddings.

Responsibilities

  • Analyse and structure complex datasets into machine-readable formats.
  • Design and optimise datasets and retrieval strategies.
  • Implement and evaluate embeddings-based search using vector databases.

Skills

Python engineering skills
EDA and data analysis
LLM data preparation
Data quality validation
Clear communication of technical findings
Familiarity with AWS

Tools

FAISS
Weaviate
Chroma

Job description

I'm partnering with a specialist UK technology consultancy to support the hire of an experienced Contract Data Scientist/ML Engineer for a major Generative AI project within a secure public‑sector environment. This is an opportunity to work on high‑impact AI initiatives, helping to redesign complex human‑driven processes through LLMs and advanced retrieval systems. The role is inside IR35 and fully remote with periodic UK travel; active SC clearance is essential.

Key Responsibilities
  • Analyse, structure, and transform complex, messy datasets into machine‑readable formats suitable for LLMs.
  • Design and optimise RAG datasets, embeddings pipelines, and retrieval strategies.
  • Implement and evaluate embeddings‑based search using vector databases.
  • Conduct robust EDA, data quality assessment, and anomaly detection.
  • Translate manual human processes into clear, machine‑interpretable logic for GenAI integration.
  • Deliver modular, production‑ready Python code with minimal oversight.
  • Evaluate LLM and RAG system performance using modern metrics and techniques.

Technical Skills
  • Strong Python engineering skills (exploratory + production‑ready).
  • Comprehensive EDA and data analysis capability.
  • Expertise in LLM data preparation, including:
    • Prompt engineering fundamentals.
    • Embeddings and vector databases (FAISS, Weaviate, Chroma).
    • RAG dataset design and retrieval optimisation (chunking strategies, hybrid search, re‑ranking).
    • Evaluation techniques for RAG (retrieval scoring, LLM-as-a-judge, hallucination checks).
  • Ability to convert unstructured, ambiguous data into structured, validated datasets.
  • Strong understanding of data quality, validation, and documenting assumptions.
  • Clear communication of technical findings to both technical and non‑technical audiences.
  • Familiarity with AWS is beneficial.