LLM Evaluation Specialist – Cultural and Linguistic Alignment (Arabic native speaker)

Innodata

Jizan

On-site

SAR 149,000 - 225,000

Full time

4 days ago

Job summary

A leading technology firm in Jizan is seeking professionals to evaluate multilingual prompt-response datasets for large language models. The role requires native proficiency in Arabic and familiarity with cultural nuances. Responsibilities include designing rubrics, reviewing translations, and creating prompts to enhance cultural awareness in LLM outputs. Candidates should have a Master's degree and experience in content evaluation. This position offers a chance to impact LLM development in culturally rich contexts.

Qualifications

  • Deep familiarity with cultural norms in the region.
  • Experience in LLM evaluation, content moderation, or linguistic QA preferred.
  • Comfortable working with spreadsheets and evaluation templates.

Responsibilities

  • Define rubrics and evaluate translations for cultural relevance.
  • Revise unnatural translations from English to Arabic.
  • Write prompts to test the cultural awareness of LLMs.

Skills

Native proficiency in Arabic
Attention to detail
Familiarity with cultural norms

Education

Master's degree in a relevant field

Tools

Gemini
ChatGPT

Job description

We are looking for linguistically and culturally aware professionals to support the evaluation and enhancement of multilingual prompt-response datasets for large language models (LLMs). This role involves rubric design, evaluation of translations and model outputs, prompt creation, and red teaming focused on identifying and surfacing cultural nuances and biases in LLM behaviour.

Key Responsibilities:

Rubric Definition & Prompt Evaluation

  • Update rubric definitions with Arabic-specific examples to ensure cultural and linguistic relevance.
  • Identify the need for additional rubrics tailored to specific languages or regional contexts.
  • Review prompts translated from English into Arabic and revise where translations appear unnatural or inaccurate.
  • Write thoughtful prompts to test the cultural awareness of LLMs.
  • Rate prompt-response pairs against the rubrics using a standardized evaluation template, and provide detailed justifications.
  • Document problematic outputs and annotate them with explanations of rubric violations or cultural insensitivities.

Required Qualifications:

  • Native proficiency in Arabic and deep familiarity with cultural norms in the region.
  • Experience in LLM evaluation, content moderation, or linguistic QA preferred.
  • Strong attention to detail to identify subtle issues in language use, tone, and cultural references.
  • Comfortable working with spreadsheets and evaluation templates.
  • Master’s degree in a relevant field.

Preferred Qualifications:

  • Prior experience with prompt engineering or LLM testing.
  • Familiarity with tools such as Gemini, ChatGPT, or similar LLM platforms.
  • Ability to clearly articulate reasoning behind rubric ratings or prompt edits.