Data Scientist

Siyada Tech

Jeddah

On-site

SAR 200,000 - 300,000

Full time

Today

Job summary

A leading technology company in Jeddah is looking for a RAG Data Engineer to curate and deliver high-quality data for its advanced AI systems. The role involves maintaining data ingestion pipelines and ensuring that the AI operates on accurate, reliable information. Ideal candidates will have strong skills in data management, preprocessing, and performance monitoring in AI environments.

Qualifications

Experience in data preprocessing and management.
Understanding of AI model performance tuning.
Strong analytical skills and attention to detail.

Responsibilities

Design and maintain ingestion pipelines for various data sources.
Transform unstructured data into usable formats.
Collaborate with AI engineers for performance improvements.
Run quality audits to ensure data accuracy.

Skills

LLM-friendly preprocessing

Metadata and schema design

API integration

Monitoring and observability

Experience with LangChain/LlamaIndex

About Siyada Tech

Siyada Tech is a Saudi technology company at the forefront of AI innovation, agentic AI, and digital transformation. We help organizations embrace the future of intelligent systems and scalable digital solutions.

Title

RAG Data Engineer (The RAG side can be learned)

(Guardian of Context. Slayer of Hallucinations.)

Mission

Be a team player first. Then curate, structure, and deliver high-integrity data to our Retrieval-Augmented Generation systems so they stop inventing fairy tales and become the most accurate AI in the Kingdom.

Responsibilities

Design and maintain ingestion pipelines for documents, DBs, CRM logs, PDFs, policies, emails, and whatever else management throws your way
Transform unstructured horrors into clean embeddings-ready data (chunking, metadata tagging, semantic structure)
Build automated monitoring for drift, stale knowledge, broken links, and hallucination triggers
Manage vector database lifecycle: indexing strategies, deduplication
Ensure iron-clad data lineage from source to query output
Collaborate with AI engineers to tune retrieval performance (recall, precision, ranking)
Maintain a central knowledge governance system: categorization, versioning, access controls
Run periodic quality audits so the model doesn’t accidentally cite a 2014 blog post as law
Document processes with a clarity unheard of in software teams
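
For a sense of what the chunking, metadata tagging, and deduplication duties above might look like in practice, here is a minimal Python sketch. It is illustrative only and not Siyada Tech's actual pipeline: the names Chunk, chunk_text, and deduplicate, the word-window sizes, and the exact-match hashing approach are all assumptions made for the example.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A piece of text plus the metadata needed to trace it back to its source."""
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_text(text: str, source: str, max_words: int = 50, overlap: int = 10) -> list[Chunk]:
    """Split text into overlapping word windows and tag each with its origin."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + max_words]
        chunks.append(Chunk(" ".join(window), {"source": source, "start_word": start}))
        # Stop once this window has reached the end of the document.
        if start + max_words >= len(words):
            break
    return chunks


def deduplicate(chunks: list[Chunk]) -> list[Chunk]:
    """Keep the first chunk for each distinct normalized text (exact matches only)."""
    seen = set()
    unique = []
    for chunk in chunks:
        # Normalize case and whitespace so trivial variants hash identically.
        normalized = " ".join(chunk.text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(chunk)
    return unique
```

Hashing catches only exact duplicates after normalization; near-duplicates (reworded or partially edited copies) would need a similarity-based method, and semantic splitting would replace the fixed word windows used here.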

Skills & Tools

LLM-friendly preprocessing: chunking logic, semantic splitting, OCR, annotation tools
Metadata, schema design, data modeling for knowledge retrieval
API integration and webhook orchestration
Monitoring and observability for both data quality and AI performance
Bonus points: hands-on experience with LangChain/LlamaIndex or custom RAG architectures

Mindset

Gets irrationally angry at duplicated documents
Knows that “data quantity” is not “data quality”
Believes hallucination is a crime against humanity (and product demos)
Works proactively, with an almost religious dedication to truth

KPIs

Latency drops while context accuracy rises
Share of content with fresh metadata rises
Meaningfully fewer “Wait, did the AI just make that up?” moments

Why Siyada Tech

You become the unseen architect of truth inside an AI powerhouse. The one who ensures our smartest systems speak facts, not fiction.
