COMPUTER VISION ENGINEER (LLM & AI Integration)

Duncan & Ross

Abu Dhabi

On-site

AED 120,000 - 200,000

Full time

Today

Job summary

A leading tech company in Abu Dhabi is seeking an experienced Computer Vision Engineer to design and implement innovative AI and LLM-integrated solutions. This role involves building advanced computer vision models for various applications and collaborating with data scientists to optimize vision-language functionalities. The ideal candidate has a strong background in computer vision and deep learning frameworks like TensorFlow and PyTorch.

Qualifications

  • 3-7 years of experience in computer vision, deep learning, or multimodal AI.
  • Experience integrating LLMs with vision systems.
  • Strong proficiency in Python and deep learning frameworks.

Responsibilities

  • Develop and implement computer vision models for various applications.
  • Integrate vision models with LLMs to enhance visual content interpretation.
  • Conduct performance benchmarking and optimization of models.

Skills

Python
Deep learning frameworks
Computer vision techniques
Analytical skills
Problem-solving skills

Education

Bachelor's or Master's degree in Computer Science or related field
PhD preferred

Tools

TensorFlow
PyTorch
OpenCV

Job description

JOB SUMMARY

We are seeking an experienced Computer Vision Engineer with a strong background in AI and Large Language Models (LLMs). The ideal candidate will design, build, and deploy computer vision solutions that integrate with generative AI and LLM frameworks to interpret, analyze, and describe visual data. This role bridges the gap between image understanding and natural language processing, enabling intelligent visual-language applications.

KEY RESPONSIBILITIES
  • Develop and implement computer vision models for image classification, object detection, segmentation, facial recognition, and visual understanding.
  • Integrate vision models with LLMs and vision-language models (e.g., GPT, LLaVA, CLIP) to build systems that interpret and describe visual content.
  • Design AI pipelines that combine text, images, and video data for multimodal learning and reasoning.
  • Use deep learning frameworks (TensorFlow, PyTorch) and OpenCV to prototype and deploy models.
  • Collaborate with data scientists and AI researchers to fine-tune vision-language models for specific tasks such as visual QA, captioning, or scene analysis.
  • Implement data preprocessing, augmentation, and annotation pipelines for large-scale image datasets.
  • Conduct performance benchmarking, optimization, and deployment of models in production environments.
  • Research and experiment with emerging techniques in Generative AI, multimodal transformers, and neural architecture optimization.
  • Develop APIs and tools for internal teams to utilize vision + LLM capabilities.
  • Ensure compliance with ethical AI practices, including bias mitigation and data privacy.

QUALIFICATIONS
  • Bachelor's or Master's degree in Computer Science, AI, Computer Vision, or related field (PhD preferred).
  • 3-7 years of experience in computer vision, deep learning, or multimodal AI.
  • Strong proficiency in Python and frameworks such as PyTorch, TensorFlow, Keras, and OpenCV.
  • Experience integrating LLMs (GPT, Claude, Gemini, or open-source models) with vision systems.
  • Solid understanding of transformer architectures, CNNs, diffusion models, and attention mechanisms.
  • Familiarity with multimodal datasets (COCO, Visual Genome, etc.) and evaluation metrics for vision tasks.
  • Experience with cloud-based AI tools (Azure AI, AWS SageMaker, Google Vertex AI, etc.).
  • Ability to write clean, scalable, and production-grade code.
  • Strong analytical, problem-solving, and communication skills.

PREFERRED QUALIFICATIONS
  • Experience with vision-language models such as CLIP, BLIP, LLaVA, or Kosmos-2.
  • Background in natural language processing and prompt engineering.
  • Hands-on experience with edge deployment (NVIDIA Jetson, OpenVINO, ONNX).
  • Knowledge of reinforcement learning, generative models, or 3D vision.
  • Publications or open-source contributions in AI research are a plus.