Working Hours: Monday – Thursday (8.30am – 6pm), Friday (8.30am – 5.30pm) (Hybrid working arrangement)
Working Location: Central
Salary Package: Basic + AWS
We are looking for a skilled and detail-oriented Gen AI Quality Engineer to join our team. In this role, you will be responsible for testing Large Language Models (LLMs) integrated into Gen AI applications such as a RAG chatbot and a film classification system.
Key Responsibilities
- Design and execute comprehensive test cases to evaluate the accuracy, reliability, and performance of LLMs integrated into Gen AI applications. This includes verifying that the model responses are relevant, contextually appropriate, and factually correct.
- Detect hallucinations, where the model produces false or fabricated information, and ensure these are promptly reported and addressed by the development team. You will help refine the models to reduce such occurrences.
- Assess the accuracy of model outputs, especially in high-precision contexts such as chatbot conversations or film classification, ensuring that LLMs produce responses that are both relevant and correct according to predefined business logic.
- Implement automated tests covering common use cases, edge cases, and regressions, focusing especially on inputs that tend to trigger hallucinations or inaccuracies in the model's responses.
- Evaluate the LLM's functionality across different scenarios to verify that it meets functional requirements, and perform non-functional testing such as performance, load, and stress tests to assess how LLMs scale when handling high loads or concurrent queries.
- Identify and document bugs related to hallucinations, inaccurate outputs, or unexpected model behaviors. Work closely with data scientists and developers to resolve issues, refine models, and ensure quality.
- Collaborate with AI researchers, engineers, and product teams to understand the nuances of model training, provide feedback for model improvement, and suggest enhancements based on test findings.
- Ensure that model updates, fine-tuning, or new training data do not introduce regressions or increase hallucinations and inaccuracies. Retest fixed issues and reassess model accuracy after each update.
- Maintain thorough test documentation, including test plans, test cases, test logs, and issue reports focused on LLM-specific concerns like hallucinations and inaccuracies.
Requirements
- Bachelor’s degree in Computer Science, Information Systems, or related field.
- Minimum 4 years of QA experience, with hands‑on exposure to LLM and Gen AI testing.