Sr Data Scientist GenAI

ZipRecruiter

Dallas (TX)

Remote

USD 130,000 - 160,000

Full time

Today

Job summary

A leading recruitment platform seeks a Senior Data Scientist specializing in NLP, LLMs, and Generative AI who will design and deploy machine learning models, mentor junior staff, and collaborate with cross-functional teams. The ideal candidate has over 10 years of experience and deep expertise in Python and various ML frameworks. This position currently allows for remote work due to COVID-19.

Benefits

Competitive salary
Flexible schedule
Opportunity for advancement

Qualifications

  • 10+ years of experience in data science / ML, with substantial work in NLP, LLMs, or Generative AI.
  • Deep hands-on experience in Python, using frameworks such as PyTorch and TensorFlow.
  • Proven track record building transformer, NLP, and LLM models.

Responsibilities

  • Design, build, fine-tune, and deploy LLMs and NLP models.
  • Own major components of ML pipelines from data ingestion to model deployment.
  • Collaborate with various teams to ensure model reliability and scalability.

Skills

Experience in data science/ML
NLP
LLMs
Generative AI
Python
PyTorch
TensorFlow
Communication

Education

Master's in Computer Science or related field

Tools

HuggingFace
LangChain

Job description

Sr Data Scientist (NLP / LLM / Generative AI)
Location: Dallas, TX

Roles & Responsibilities
  • Design, build, fine-tune, and deploy LLMs, transformer-based NLP models, and GenAI solutions for both batch and real-time/streaming contexts.
  • Own all major components of ML pipelines: data ingestion, cleaning, pre-processing (structured & unstructured), embedding, search & retrieval, prompt engineering, and RAG (Retrieval-Augmented Generation).
  • Collaborate closely with ML Engineers, MLOps, software engineering, product, compliance, legal, and other teams to move models from prototype to production, ensuring reliability, scalability, monitoring, and maintainability.
  • Define and implement evaluation frameworks: accuracy, bias, fairness, hallucination, consistency, latency; run UAT, stress-tests, drift detection.
  • Optimize models and pipelines for performance, cost, and efficiency.
  • Ensure best practices in model development: version control, repeatability, documentation, governance, and ethical AI use.
  • Mentor more junior data scientists; help build team skills in NLP, GenAI practices, prompt engineering, fine-tuning.
  • Identify new use cases; prototype innovations in GenAI/NLP; keep up with the latest research and open-source developments, and decide what to adopt.
Must-Have Qualifications
  • 10+ years of experience in data science / ML, with substantial work in NLP, LLMs, or Generative AI.
  • Deep hands-on experience in Python, using frameworks such as PyTorch, TensorFlow, and HuggingFace.
  • Proven track record building transformer, NLP, and LLM models; experience with fine-tuning and prompt engineering.
  • Solid experience with information retrieval / search: keyword + semantic search, embeddings, vector databases.
  • Experience working in production / deploying models (batch and streaming), working with MLOps practices.
  • Strong algorithmic / statistical / mathematical fundamentals. Ability to reason about model behaviour, bias, uncertainty.
  • Good communicator: able to translate complex technical detail to business / non-technical stakeholders.
Nice to Have
  • Master's in Computer Science, Computational Linguistics, Statistics, Machine Learning, or a related field.
  • Experience with multimodal models (vision + text) or emerging LLMs and agent-based systems.
  • Experience with open-source LLMs & toolkits; familiarity with LangChain or similar frameworks.
  • Prior experience in regulated environments (finance, risk, legal, compliance) with strong governance and privacy requirements.

Temporarily remote due to COVID-19.
