Data Scientist

Siyada Tech

Jeddah

On-site

SAR 200,000 - 300,000

Full time

Today

Job summary

A leading technology company in Jeddah is looking for a RAG Data Engineer to curate and deliver high-quality data for its advanced AI systems. The role involves maintaining data ingestion pipelines and ensuring that the AI operates on accurate, reliable information. Ideal candidates will have strong skills in data management, preprocessing, and performance monitoring in AI environments.

Qualifications

  • Experience in data preprocessing and management.
  • Understanding of AI model performance tuning.
  • Strong analytical skills and attention to detail.

Responsibilities

  • Design and maintain ingestion pipelines for various data sources.
  • Transform unstructured data into usable formats.
  • Collaborate with AI engineers for performance improvements.
  • Run quality audits to ensure data accuracy.

Skills

  • LLM-friendly preprocessing
  • Metadata and schema design
  • API integration
  • Monitoring and observability
  • Experience with LangChain/LlamaIndex

Job description

About Siyada Tech

Siyada Tech is a Saudi technology company at the forefront of AI innovation, agentic AI, and digital transformation. We help organizations embrace the future of intelligent systems and scalable digital solutions.

Title

RAG Data Engineer (The RAG side can be learned)

(Guardian of Context. Slayer of Hallucinations.)

Mission

First, be a team player. Then curate, structure, and deliver high-integrity data to our Retrieval-Augmented Generation systems so they stop inventing fairy tales and instead become the most accurate AI in the Kingdom.

Responsibilities
  • Design and maintain ingestion pipelines for documents, DBs, CRM logs, PDFs, policies, emails, and whatever else management throws your way
  • Transform unstructured horrors into clean, embedding-ready data (chunking, metadata tagging, semantic structure)
  • Build automated monitoring for drift, stale knowledge, broken links, and hallucination triggers
  • Manage vector database lifecycle: indexing strategies, deduplication
  • Ensure iron-clad data lineage from source to query output
  • Collaborate with AI engineers to tune retrieval performance (recall, precision, ranking)
  • Maintain a central knowledge governance system: categorization, versioning, access controls
  • Run periodic quality audits so the model doesn’t accidentally cite a 2014 blog post as law
  • Document processes with a clarity unheard of in software teams
Skills & Tools
  • LLM-friendly preprocessing: chunking logic, semantic splitting, OCR, annotation tools
  • Metadata, schema design, data modeling for knowledge retrieval
  • API integration and webhook orchestration
  • Monitoring and observability for both data quality and AI performance
  • Bonus points: hands‑on with LangChain/LlamaIndex or custom RAG architectures
Mindset
  • Gets irrationally angry at duplicated documents
  • Knows that “data quantity” is not “data quality”
  • Believes hallucination is a crime against humanity (and product demos)
  • Works proactively, with an almost religious dedication to truth
KPIs
  • Latency drops while context accuracy rises
  • Share of content with fresh metadata increases
  • Meaningfully fewer “Wait, did the AI just make that up?” moments
Why Siyada Tech
  • You become the unseen architect of truth inside an AI powerhouse. The one who ensures our smartest systems speak facts, not fiction.