Senior/Principal Data Scientist - NLP (Remote) - United Kingdom

TN United Kingdom

London

Remote

GBP 70,000 - 90,000

Full time

Yesterday

Job summary

A leading company in the life sciences sector is seeking a Senior/Principal Data Scientist specializing in NLP. The role involves developing LLM-based agents for extracting healthcare information and requires strong expertise in NLP and machine learning. The position offers a flexible work environment and various benefits, including a personal development budget and fitness reimbursement.

Benefits

Personal development budget
Veeva charitable giving program
Fitness reimbursement
Life insurance
Pension fund

Qualifications

  • At least 4 years of experience as a data scientist or 2+ years with a Ph.D.
  • Strong knowledge of NLP, Machine Learning, and Deep Learning.
  • Experience with large language models and transformer architectures.

Responsibilities

  • Develop LLM-based agents for healthcare sector information extraction.
  • Create semantic search functionalities for user queries.
  • Collaborate with software developers and DevOps engineers.

Skills

NLP
Machine Learning
Deep Learning
Python
Collaboration

Education

Master's or Ph.D. in Computer Science, AI, Computational Linguistics, or a related field

Tools

NLTK
spaCy
Hugging Face
PyTorch
JAX
Docker
Kubernetes
Ray
Spark

Job description

Veeva is a mission-driven organization dedicated to helping our customers in Life Sciences and Regulated industries bring their products to market faster. We value doing the right thing, customer success, employee success, and speed. Our teams develop transformative cloud software, services, consulting, and data solutions to enhance efficiency and effectiveness. Veeva supports a flexible work environment—work from home, at a customer site, or in an office.

Our product connects life sciences and key stakeholders to improve research and healthcare. It offers real-time academic, social, and medical data to build comprehensive profiles, aiding our industry partners in accelerating therapeutics development and clinical trials, ultimately helping patients receive urgent care sooner.

Role Overview

You will develop LLM-based agents specialized in searching and extracting detailed healthcare sector information about Key Opinion Leaders (KOLs). This includes creating an end-to-end human-in-the-loop pipeline to analyze unstructured medical documents (academic articles, clinical guidelines, meeting notes). These agents will perform semantic searches and provide precise answers across multiple languages and disciplines, utilizing cloud infrastructure for model development and deployment. Collaboration with software developers and DevOps engineers is essential.

Key Responsibilities
  1. Adopt the latest NLP technologies and trends in our platform.
  2. Develop LLM-based agents capable of function calling and tool use (e.g., web browsers).
  3. Apply preference-based training methods, such as Reinforcement Learning from Human Feedback (RLHF) with PPO, or DPO, to train LLMs on human preferences.
  4. Design and implement pipelines to extract information from large-scale, unstructured, multi-domain, multilingual data.
  5. Create semantic search functionalities to answer user queries effectively.
  6. Develop and utilize techniques such as named entity recognition, entity linking, slot-filling, few-shot learning, active learning, question answering, and dense passage retrieval.
  7. Analyze data models per source and region, and interpret model decisions.
  8. Collaborate with data quality teams to define metrics and evaluate models qualitatively and quantitatively.
  9. Utilize cloud infrastructure for development and work with teams to deploy models into production.

Minimum Requirements
  1. At least 4 years of experience as a data scientist (or 2+ years with a Ph.D.).
  2. Master's or Ph.D. in Computer Science, AI, Computational Linguistics, or a related field.
  3. Strong knowledge of NLP, Machine Learning, and Deep Learning.
  4. Experience with large language models and transformer architectures (e.g., GPT, BERT).
  5. Familiarity with large-scale data processing, preferably in the medical domain.
  6. Proficiency in Python and NLP libraries (NLTK, spaCy, Hugging Face).
  7. Experience with big data frameworks (Ray, Spark) and deep learning frameworks (PyTorch, JAX).
  8. Experience with cloud services (AWS, GCP, Azure) and containerization (Docker, Kubernetes).
  9. Strong collaboration and communication skills, adaptable to startup environments.
  10. Socially competent, team-oriented, energetic, and ambitious, with an agile mindset.

Nice to Have
  • Background in Medical NLP.
  • Experience with training, fine-tuning, and serving LLMs.
  • Experience in the life/health science industry, particularly pharma.
  • Publications in peer-reviewed AI journals.
  • Production-grade development skills.
  • Leadership skills and a network for hiring and team growth.
  • Experience with NoSQL databases like MongoDB.
  • Familiarity with model registry solutions such as MLflow.
  • Experience with distributed computing platforms like Ray and Spark.

Perks & Benefits
  • Personal development budget.
  • Veeva charitable giving program.
  • Fitness reimbursement.
  • Life insurance and pension fund.

Veeva is committed to fostering an inclusive, diverse workforce. If you need assistance or accommodations during the application process, please contact us.
