Search - Workchat - Senior Data Scientist

Elastic

Camden Town

On-site

GBP 70,000 - 90,000

Full time

Today

Job summary

A leading search AI company is seeking a Senior Data Scientist to design evaluation pipelines and drive product decisions. The ideal candidate should have 5-8 years of experience in applied Data Science/Machine Learning with a focus on Information Retrieval and Natural Language Processing. This role involves collaborating with cross-functional teams to enhance product outcomes in a dynamic environment.

Qualifications

  • Experience in applied DS/ML with strong IR/NLP experience.
  • Experience turning experimental results into product calls.

Responsibilities

  • Design and maintain evaluation pipelines.
  • Collaborate with engineers and product teams.
  • Use evaluation metrics for product improvements.
  • Mentor peers in evaluation craft.

Skills

Python
PyTorch/Transformers
Pandas
Elasticsearch
Deep Learning
NLP
Information Retrieval

Experience

5-8 years in applied Data Science/Machine Learning
Job description
Overview

The Search Conversational Experiences team builds Elastic's new conversational (agentic) platform that lets customers chat with their own data in Elasticsearch. We own the quality layer for RAG, agents and tools, retrieval/citations, streaming, memory, and the evaluation signals that turn open-ended questions into grounded, reliable answers. As a Senior Data Scientist, you'll be part of a cross-functional team (backend, DS, PM, UX) driving chat quality end-to-end: designing and running evaluation pipelines, improving prompts and tool behaviors, and turning measurements into product decisions that customers can feel. You'll help tackle frontier problems—folding RAG and vector search into an agent's knowledge base, dynamically enriching model context to boost groundedness, shaping agent routing and tool selection policies, lighting up agent-driven visualizations on top of Elasticsearch data, and exploring multimodality and reasoning strategies where they truly move the needle. This is an applied role: you will prototype, evaluate, and partner with engineers to ship.

Responsibilities
  • Design and maintain offline/online evaluation pipelines for conversational search: golden sets, rubric/LLM-as-judge calibration, groundedness/citation checks, and A/B tests.
  • Build and compare retrieval & re-ranking baselines (sparse + dense), query understanding, and semantic rewrites; land improvements with clear metrics.
  • Use results to drive product decisions: model selection, efficient agent routing, tool gating, and agent customization for Elastic use cases in search and beyond.
  • Instrument dashboards and telemetry so helpfulness, faithfulness, latency, and cost trade-offs are visible and trustworthy; guard against regressions in CI.
  • Collaborate tightly with backend engineers on contracts (ES|QL, citations, telemetry), and with PM/UX to translate findings into shipped features.
  • Share outcomes clearly (docs, notebooks, PRs) and mentor peers in experiment design and evaluation craft.
Qualifications
  • 5-8 years in applied DS/ML with strong IR/NLP experience (RAG, dense/sparse retrieval, re-ranking, vector search).
  • Proficiency in Python, PyTorch/Transformers, Pandas; reproducible experiments (e.g., MLflow), versioned datasets, and clean, reviewable code.
  • Hands-on evaluation expertise: offline metrics (nDCG/MRR/Recall@k), LLM-as-judge calibration, groundedness/citation scoring, and online A/B testing.
  • Experience turning experimental results into clear product calls (models, routing, tools) and communicating them crisply to cross-functional partners.
  • Practical Elasticsearch experience (or similar); ES|QL familiarity is a plus.
  • Comfort working in a distributed, async-first environment; strong written communication; low-ego collaboration.
Company

Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale - unleashing the potential of businesses and people. The Elastic Search AI Platform, used by more than 50% of the Fortune 500, brings together the precision of search and the intelligence of AI to enable everyone to accelerate the results that matter. By taking advantage of all structured and unstructured data - securing and protecting private information more effectively - Elastic's complete, cloud-based solutions for search, security, and observability help organizations deliver on the promise of AI.
