Activez les alertes d’offres d’emploi par e-mail !

AI Red Teaming & LLM Quality Assurance Specialist

Innodata Inc.

Strasbourg

À distance

EUR 40 000 - 60 000

Temps partiel

Hier
Soyez parmi les premiers à postuler

Résumé du poste

A leading tech company is looking for an AI Red Teaming & LLM Quality Assurance Specialist to evaluate AI-generated content for safety and compliance. This freelance position, based in Strasbourg, requires a strong background in AI red teaming and quality assurance. The selected candidate will work remotely up to 30 hours per week and play a critical role in identifying and mitigating risks associated with large language models.

Qualifications

  • Proven experience in AI red teaming, LLM safety testing, or adversarial prompt design.
  • Familiarity with prompt engineering and ethical considerations in generative AI.
  • Strong background in Quality Assurance and content review.

Responsabilités

  • Conduct Red Teaming exercises to identify unsafe outputs from LLMs.
  • Evaluate AI prompts across various domains.
  • Document findings and vulnerability reports clearly.

Connaissances

AI red teaming
LLM safety testing
Adversarial prompt design
Quality Assurance
Critical thinking

Description du poste

Job Title : AI Red Teaming & LLM Quality Assurance Specialist

Type : Freelance | Remote

Language Requirement : French speakers only

Engagement : 25–30 hours per week

Job Description

We are seeking highly analytical and detail-oriented professionals with experience in Red Teaming, Prompt Evaluation, and AI / LLM Quality Assurance . The selected candidate will play a key role in testing and evaluating AI-generated content to identify vulnerabilities, assess risks, and ensure compliance with safety, ethical, and quality standards.

Key Responsibilities

  • Conduct Red Teaming exercises to identify adversarial, harmful, or unsafe outputs from large language models (LLMs).
  • Evaluate and stress-test AI prompts across multiple domains (e.g., finance, healthcare, security).
  • Develop and apply test cases to assess accuracy, bias, toxicity, hallucinations, and misuse potential.
  • Collaborate with researchers and engineers to report risks and suggest mitigations .
  • Perform manual QA and content validation across model versions, ensuring accuracy, coherence, and adherence to guidelines.
  • Create evaluation frameworks and scoring rubrics for prompt performance and safety compliance.
  • Document findings, edge cases, and vulnerability reports with clarity and structure.

Requirements

  • Proven experience in AI red teaming, LLM safety testing, or adversarial prompt design .
  • Familiarity with prompt engineering, NLP tasks, and ethical considerations in generative AI.
  • Strong background in Quality Assurance, content review, or test case development .
  • Understanding of LLM behaviors, failure modes, and evaluation metrics .
  • Excellent critical thinking, writing, and problem-solving skills .
  • Ability to work independently and meet deadlines.
  • Preferred Qualifications

  • Prior work with OpenAI, Anthropic, Google DeepMind , or other LLM safety initiatives.
  • Experience in risk assessment, red team security testing, or AI policy & governance .
  • Background in linguistics, psychology, or computational ethics is a plus.
  • If Interested, kindly send your resume at : Tsharma@innodata.com

    Obtenez votre examen gratuit et confidentiel de votre CV.
    ou faites glisser et déposez un fichier PDF, DOC, DOCX, ODT ou PAGES jusqu’à 5 Mo.