Search - Agent Builder - Senior Data Scientist

Elastic

City of Westminster

On-site

GBP 70,000 - 90,000

Full time

Today

Job summary

A leading technology company in the UK is seeking a Senior Data Scientist to join their team focused on conversational experiences. This role involves designing evaluation pipelines, improving conversational search performance, and collaborating with engineers and product teams. The ideal candidate will have 5-8 years of applicable experience in data science and machine learning, proficiency in Python, and a deep understanding of information retrieval and NLP techniques.

Qualifications

  • 5-8 years in applied DS/ML with strong IR/NLP experience.
  • Proficiency in Python and ML frameworks.
  • Hands-on evaluation expertise in metrics and A/B testing.

Responsibilities

  • Design and maintain evaluation pipelines for conversational search.
  • Build and compare retrieval and re-ranking baselines.
  • Collaborate tightly with backend engineers.

Skills

Applied Data Science/Machine Learning
Information Retrieval/NLP
Python
PyTorch/Transformers
Pandas
Elasticsearch

Tools

MLflow

Job description

The Search Conversational Experiences team builds Elastic's new conversational (agentic) platform that lets customers chat with their own data in Elasticsearch. We own the quality layer for RAG, agents and tools, retrieval/citations, streaming, memory, and, crucially, the evaluation signals that turn open-ended questions into grounded, reliable answers.

As a Senior Data Scientist, you'll be part of a cross-functional team (backend, DS, PM, UX) driving chat quality end-to-end: designing and running evaluation pipelines, improving prompts and tool behaviors, and turning measurements into product decisions that customers can feel.

You'll help tackle frontier problems: folding RAG and vector search into an agent's knowledge base, dynamically enriching model context to boost groundedness, shaping agent routing and tool selection policies, lighting up agent-driven visualizations on top of Elasticsearch data, and exploring multimodality and reasoning strategies where they truly move the needle. This is an applied role: you will prototype, evaluate, and partner with engineers to ship.

Responsibilities

  • Design and maintain offline/online evaluation pipelines for conversational search: golden sets, rubric/LLM‑as‑judge calibration, groundedness/citation checks, and A/B tests.
  • Build and compare retrieval & re‑ranking baselines (sparse + dense), query understanding, and semantic rewrites; land improvements with clear metrics.
  • Use results to drive product decisions: model selection, efficient agent routing, tool gating, and agent customization for Elastic use cases in search and beyond.
  • Instrument dashboards and telemetry so helpfulness, faithfulness, latency, and cost trade‑offs are visible and trustworthy; guard against regressions in CI.
  • Collaborate tightly with backend engineers on contracts (ES|QL, citations, telemetry), and with PM/UX to translate findings into shipped features.
  • Share outcomes clearly (docs, notebooks, PRs) and mentor peers in experiment design and evaluation craft.

Qualifications

  • 5-8 years in applied DS/ML with strong IR/NLP experience (RAG, dense/sparse retrieval, re‑ranking, vector search).
  • Proficiency in Python, PyTorch/Transformers, Pandas; reproducible experiments (e.g., MLflow), versioned datasets, and clean, reviewable code.
  • Hands‑on evaluation expertise: offline metrics (nDCG / MRR / Recall@k), LLM‑as‑judge calibration, groundedness/citation scoring, and online A/B testing.
  • Experience turning experimental results into clear product calls (models, routing, tools) and communicating them crisply to cross‑functional partners.
  • Practical Elasticsearch experience (or similar); ES|QL familiarity is a plus.
  • Comfort working in a distributed, async‑first environment; strong written communication; low‑ego collaboration.
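
For readers less familiar with the offline metrics named in the qualifications (nDCG, MRR, Recall@k), the following is a minimal illustrative sketch in plain Python. It is not Elastic's evaluation code: the function names, the document IDs, and the choice of linear gain for nDCG are all assumptions made for the example.

```python
import math

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(runs):
    """MRR over an iterable of (ranked_ids, relevant_ids) pairs, one per query."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)

def ndcg_at_k(ranked_ids, gains, k):
    """nDCG@k with linear gain (an assumption; exponential gain is also common).

    `gains` maps doc_id -> graded relevance; unlisted documents count as 0.
    """
    dcg = sum(gains.get(doc_id, 0) / math.log2(i + 2)
              for i, doc_id in enumerate(ranked_ids[:k]))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical judgments for a single query.
gains = {"d1": 3, "d2": 1, "d5": 2}
ranked = ["d3", "d1", "d7", "d2"]
relevant = set(gains)

print(recall_at_k(ranked, relevant, k=3))
print(reciprocal_rank(ranked, relevant))
print(ndcg_at_k(ranked, gains, k=4))
```

In this toy run only one of the three relevant documents lands in the top 3, and the first relevant hit sits at rank 2, so Recall@3 is 1/3 and the reciprocal rank is 0.5. A ranking that returned d1, d5, d2 in that order would score an nDCG of 1.0.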