AI Engineer

ELLIOTT MOSS CONSULTING PTE. LTD.

Singapore

On-site

SGD 80,000 - 100,000

Full time

Yesterday

Job summary

A consulting firm specializing in AI solutions is looking for an experienced AI Engineer to design and implement Large Language Model (LLM)-based solutions. Candidates should have at least 3 years of experience in AI/ML engineering, with a focus on LLMs and advanced AI workflows. Responsibilities include deploying LLM frameworks, optimizing performance, and collaborating with teams to align AI systems with business needs. The position requires proficiency in Python, Apache Airflow, and vector databases. Competitive compensation is offered in Singapore.

Qualifications

  • At least 3 years of experience with Large Language Models (LLMs).
  • Hands-on experience with vLLM and model quantization techniques.
  • Strong proficiency in Apache Airflow for AI pipelines.

Responsibilities

  • Configure and optimize vLLM for low-latency, high-throughput serving.
  • Design RAG pipelines using vector databases.
  • Collaborate with cross-functional teams to align AI solutions with business needs.

Skills

Large Language Models (LLMs)
Python
Docker
Kubernetes
Apache Airflow
RAG frameworks
multi-agent systems
AI observability tools

Education

Bachelor’s degree in Information Technology, Computer Science, Finance, or a related field

Tools

vLLM
vector databases (e.g., FAISS, Milvus, Pinecone, Weaviate)

Job Description
  • We are seeking a skilled AI Engineer with at least 3 years of hands-on experience in designing, building, and deploying Large Language Model (LLM)-based solutions.
  • The ideal candidate will be responsible for the end-to-end lifecycle of AI applications, from high-performance model inference and optimization to the development of advanced Agentic AI workflows using RAG and CAG patterns.
  • This role requires close collaboration with product, data, and engineering teams to translate business requirements into scalable, reliable, and cost‑efficient AI systems.

Required Skills & Qualifications
  • Bachelor’s degree in Information Technology, Computer Science, Finance, or a related field.
  • At least 3 years of experience working with Large Language Models (LLMs) in production environments.
  • Hands-on expertise with vLLM and model quantization techniques such as AWQ and GPTQ (a minimal sketch follows this list).
  • Strong proficiency in Apache Airflow for scheduling and orchestrating complex data and AI pipelines.
  • Experience with RAGFlow or similar deep‑document Retrieval-Augmented Generation (RAG) frameworks.
  • Practical experience with vector databases (e.g., FAISS, Milvus, Pinecone, Weaviate).
  • Proven ability to design and implement multi‑agent systems that leverage tools and external APIs to perform multi‑step tasks.
  • Advanced proficiency in Python, Docker, and Kubernetes.
  • Experience using AI observability and monitoring tools to track latency, cost, throughput, and hallucination rates.
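
For illustration, here is a minimal sketch of serving an AWQ-quantized model with vLLM's offline Python API, with prefix caching enabled as one way to reuse KV cache across prompts that share a prefix, in the spirit of the CAG strategies described below. The model checkpoint, memory settings, and prompt are assumptions for the example, not details from this posting.

```python
# Minimal sketch; the checkpoint, memory budget, and prompt are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ",  # illustrative AWQ checkpoint
    quantization="awq",               # run the AWQ-quantized weights
    gpu_memory_utilization=0.90,      # leave headroom for the KV cache
    max_model_len=8192,
    enable_prefix_caching=True,       # reuse KV cache for shared prompt prefixes
)

params = SamplingParams(temperature=0.2, max_tokens=256)
outputs = llm.generate(["Explain why KV-cache reuse lowers serving cost."], params)
print(outputs[0].outputs[0].text)
```

In production, the same engine settings would typically sit behind vLLM's OpenAI-compatible server so that multiple applications can share one endpoint.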

Key Responsibilities
  • Configure, deploy, and optimize vLLM and other inference frameworks to ensure low-latency, high-throughput LLM serving.
  • Design and implement RAG pipelines using vector databases and Cache-Augmented Generation (CAG) strategies to reduce redundant computation and improve response quality (see the retrieval sketch after this list).
  • Deploy and tune vLLM clusters to support scalable, production-grade API endpoints for multiple open-source LLMs.
  • Design, implement, and maintain Apache Airflow DAGs and RAGFlow pipelines to automate the AI lifecycle, including data ingestion, indexing, evaluation, and prompt/version management (see the example DAG after this list).
  • Develop, version‑control, and continuously refine system prompts, applying techniques such as Chain-of-Thought (CoT) to improve reasoning accuracy and consistency.
  • Implement CAG strategies to optimize KV cache reuse and minimize compute costs for long-context and multi‑step AI tasks.
  • Build and refine Agentic AI workflows, enabling autonomous task planning, tool usage, and API orchestration across different LLM backends (a simplified tool-dispatch pattern is sketched after this list).
  • Monitor and analyze AI system performance using observability tools, ensuring reliability, cost efficiency, and controlled hallucination rates (see the metrics sketch after this list).
  • Collaborate with cross-functional teams to align AI solutions with business objectives, security standards, and scalability requirements.
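
As a rough illustration of the retrieval step in such a RAG pipeline, the sketch below embeds a toy corpus, indexes it in FAISS, and assembles a grounded prompt. The embedding model, corpus, and query are placeholders; the resulting prompt would then go to the LLM serving endpoint.

```python
# Retrieval-step sketch; embedding model, corpus, and query are placeholders.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

corpus = [
    "vLLM serves LLMs with continuous batching and PagedAttention.",
    "FAISS provides efficient similarity search over dense vectors.",
    "Airflow DAGs orchestrate ingestion, indexing, and evaluation jobs.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
vectors = embedder.encode(corpus, normalize_embeddings=True)

index = faiss.IndexFlatIP(vectors.shape[1])  # inner product == cosine on unit vectors
index.add(np.asarray(vectors, dtype="float32"))

query = "How do we keep LLM serving latency low?"
q = embedder.encode([query], normalize_embeddings=True)
_, ids = index.search(np.asarray(q, dtype="float32"), 2)

context = "\n".join(corpus[i] for i in ids[0])
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
# `prompt` would then be sent to the LLM serving endpoint.
```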
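
Likewise, a minimal Airflow DAG for the ingest, index, and evaluate loop might look like the sketch below; the task bodies, DAG id, and daily schedule are assumptions rather than details of the firm's actual pipelines.

```python
# DAG skeleton; task bodies, dag_id, and schedule are illustrative placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest_documents(**_):
    ...  # pull new documents from the source systems

def build_vector_index(**_):
    ...  # chunk, embed, and upsert into the vector database

def evaluate_rag(**_):
    ...  # run retrieval and answer-quality checks on a held-out query set

with DAG(
    dag_id="rag_lifecycle",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest", python_callable=ingest_documents)
    index = PythonOperator(task_id="index", python_callable=build_vector_index)
    evaluate = PythonOperator(task_id="evaluate", python_callable=evaluate_rag)

    ingest >> index >> evaluate
```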
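
For the agentic workflows, one simplified tool-dispatch pattern is sketched below: the model is assumed to reply either with a JSON action naming a registered tool or with a plain-text final answer. The tools, the JSON format, and the scripted call_llm stub are all assumptions made so the example runs on its own; production frameworks add planning, memory, and error handling.

```python
# Simplified agent loop; tools, action format, and the scripted LLM stub are
# illustrative assumptions, not part of the actual stack described above.
import json

def search_docs(query: str) -> str:
    return f"(top documents for: {query})"   # placeholder tool

TOOLS = {"search_docs": search_docs}

def call_llm(messages: list[dict]) -> str:
    # Stand-in for a call to the LLM endpoint, scripted so the sketch runs:
    # request a tool on the first turn, then return a final answer.
    if not any(m["content"].startswith("Tool result:") for m in messages):
        return json.dumps({"tool": "search_docs",
                           "args": {"query": messages[0]["content"]}})
    return "Final answer grounded in the retrieved documents."

def run_agent(task: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_llm(messages)
        try:
            action = json.loads(reply)       # JSON means "use a tool"
        except json.JSONDecodeError:
            return reply                     # plain text means "final answer"
        result = TOOLS[action["tool"]](**action["args"])
        messages.append({"role": "assistant", "content": reply})
        messages.append({"role": "user", "content": f"Tool result: {result}"})
    return "Step limit reached without a final answer."

print(run_agent("Find guidance on reducing LLM serving latency."))
```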
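
Finally, a sketch of the kind of latency and throughput measurement an observability layer might record, assuming a locally running OpenAI-compatible vLLM endpoint; the URL, model name, and prompt are placeholders.

```python
# Metrics sketch; the endpoint URL, model name, and prompt are placeholders.
import time

import requests

VLLM_URL = "http://localhost:8000/v1/completions"  # assumed local vLLM server

def timed_completion(prompt: str, model: str = "my-quantized-model") -> dict:
    start = time.perf_counter()
    resp = requests.post(
        VLLM_URL,
        json={"model": model, "prompt": prompt, "max_tokens": 128},
        timeout=60,
    )
    resp.raise_for_status()
    usage = resp.json()["usage"]
    latency = time.perf_counter() - start
    return {
        "latency_s": round(latency, 3),
        "completion_tokens": usage["completion_tokens"],
        "tokens_per_s": round(usage["completion_tokens"] / latency, 1),
    }

# These per-request metrics would feed dashboards tracking latency, cost,
# and throughput alongside hallucination-rate evaluations.
print(timed_completion("Explain cache-augmented generation in one sentence."))
```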

Experience Level
  • 3+ years of relevant experience in AI/ML engineering, with demonstrated production experience in LLM-based systems.