A leading recruitment firm is seeking an experienced Data Engineer to develop ingestion pipelines and manage complex datasets in Abu Dhabi. The role requires a strong background in Python and data management, alongside a deep understanding of LLM data constraints. Candidates should have over 10 years of experience in data engineering and possess relevant technical skills to support cutting-edge AI projects.
Bachelor's in Computer Application (Computers)
Nationality
Any Nationality
Vacancy
1 Vacancy
Job Description
· Build ingestion pipelines for structured/unstructured data using Python
· Clean, normalize, and prepare data in formats suitable for LLM fine-tuning (e.g., JSONL, CSV)
· Create high-quality, task-specific datasets for training and evaluation
· Apply versioning to datasets using DVC or LakeFS for reproducibility
· Generate embeddings using HuggingFace or Sentence Transformers
· Manage vector indexes (FAISS, Weaviate) and optimize retrieval workflows
· Tokenize and chunk long-form data for context window optimization
Requirements
· 10+ years’ experience in a Data Engineering role
· 2+ years’ experience in an AI-adjacent data role
· Proficiency in Python, pandas, and text processing tools
· Familiarity with tokenization libraries (HuggingFace Tokenizers, SentencePiece)
· Experience managing datasets and object storage (MinIO, NFS)
· Understanding of LLM data constraints (context windows, formatting, prompt injection)
Disclaimer: Naukrigulf.com is only a platform to bring jobseekers & employers together. Applicants are advised to research the bona fides of the prospective employer independently. We do NOT endorse any requests for money payments and strictly advise against sharing personal or bank-related information. We also recommend you visit Security Advice for more information. If you suspect any fraud or malpractice, email us at abuse@naukrigulf.com