
Bebo Technologies - Software Engineer - AI/ML Testing

bebo Technologies

Khordha

On-site

INR 8,00,000 - 12,00,000

Full time

Yesterday

Job summary

A technology solutions firm is looking for a QA Specialist to focus on AI/ML testing. The ideal candidate will have experience validating LLM outputs and evaluating RAG systems. Strong knowledge of AI and Machine Learning, along with proficiency in Python or Java, is required. Join a dynamic team to test and improve AI agents and workflows, ensuring quality and performance in a rapidly evolving field. This role emphasizes collaboration and continuous improvement in AI QA practices.

Qualifications

  • 3-5 years of QA experience with a focus on AI/ML testing.
  • Proficient in functional/non-functional testing and QA methodologies.
  • Knowledge of LLM evaluation and RAG concepts.

Responsibilities

  • Validate LLM outputs for accuracy and usability.
  • Evaluate RAG systems for document relevance and context.
  • Design AI-specific testing strategies for validation.

Skills

Artificial Intelligence
Machine Learning
AI Agents
Large Language Models (LLMs)
Automation Frameworks (Selenium, Playwright)
Python
Java
Analytical Reasoning

Education

BE, B.Tech, M.Tech, MCA or equivalent in Computer Science

Job description

Requirements
  • BE, B.Tech, M.Tech, MCA or equivalent degree in Computer Science (or related field).
  • 3-5 years of QA experience, with at least 2 years focused on AI/ML testing (LLMs, RAG, AI agents).
  • Proficiency in Artificial Intelligence, Machine Learning, AI Agents, and Large Language Models (LLMs).
  • Strong knowledge of QA methodologies, test design, functional/non-functional testing, and defect lifecycle management.
  • Solid understanding of LLM evaluation, hallucination types, prompt behavior, response scoring, and quality metrics (a small response-scoring sketch follows this list).
  • Good understanding of RAG concepts including vector databases, embeddings, and retrieval relevance.
  • Coding proficiency in Python or Java (both preferred).
  • Hands‑on experience in Web/API automation using frameworks like Playwright, Selenium, or REST Assured.
  • Experience in writing automation scripts, maintaining test frameworks, and integrating test suites into CI/CD pipelines.
  • Ability to analyze large sets of AI outputs for patterns and systemic issues.
  • Excellent analytical reasoning, problem-solving, and communication skills.
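
To make the response-scoring requirement above concrete, here is a minimal illustrative sketch in Python (one of the languages named in this posting). It is not the employer's actual framework: the keyword_coverage scorer, the rubric, the 0.8 threshold, and the hard-coded sample answer are assumptions for demonstration only; in a real suite the response would come from the model under test.

    # Illustrative only: a keyword-coverage scorer of the kind used for LLM
    # response scoring. Rubric, threshold, and sample answer are made up.
    def keyword_coverage(response: str, required_terms: list[str]) -> float:
        """Return the fraction of required terms present in the response."""
        text = response.lower()
        hits = sum(1 for term in required_terms if term.lower() in text)
        return hits / len(required_terms) if required_terms else 1.0

    def test_refund_policy_answer():
        # In a real suite this response would come from the model under test.
        response = "Refunds are issued within 14 days to the original payment method."
        rubric = ["refund", "14 days", "original payment method"]
        assert keyword_coverage(response, rubric) >= 0.8, "answer misses required facts"

    if __name__ == "__main__":
        test_refund_policy_answer()
        print("coverage check passed")

Checks like this are usually run per pattern category (billing questions, policy questions, and so on) so that a failing score points at a cluster of weak responses rather than a single bad answer.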
Job Responsibilities
  • Test and validate LLM outputs for accuracy, correctness, completeness, consistency, and usability, including hallucination analysis.
  • Evaluate RAG systems, including retrieval accuracy, document relevance, context construction, and full response generation flows (a retrieval hit-rate sketch follows this list).
  • Test AI agents and autonomous workflows, validating decision‑making, task execution, and error handling.
  • Design and execute AI‑specific test strategies: dataset creation, edge‑case testing, adversarial testing, pattern‑based testing, and regression validation.
  • Develop evaluation frameworks, scoring rubrics, and benchmarking models for AI quality assessment.
  • Analyze large volumes of AI‑generated responses to identify patterns, root causes, and issue clusters rather than isolated defects.
  • Apply knowledge of testing conversational AI, workflows, and agent‑based systems.
  • Draw on exposure to vector search tools and embedding quality analysis.
  • Validate fixes using new examples from the same pattern category, ensuring true model improvement.
  • Collaborate closely with AI/ML engineers, QA teams, and product managers to improve AI accuracy and performance.
  • Contribute to continuous improvement of AI QA practices, automation, tools, and evaluation datasets.
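
As an illustration of the retrieval-accuracy evaluation mentioned in the responsibilities above, the following Python sketch computes a simple top-k hit rate over a small labelled query set. It is hypothetical, not the team's framework: hit_rate, stub_retrieve, and the labelled data exist only so the example runs; a real test would call the actual vector store.

    # Illustrative only: top-k hit rate for RAG retrieval over labelled queries.
    from typing import Callable

    def hit_rate(labelled: dict[str, str],
                 retrieve: Callable[[str, int], list[str]],
                 k: int = 3) -> float:
        """Fraction of queries whose expected document id is in the top-k results."""
        hits = sum(1 for query, doc_id in labelled.items() if doc_id in retrieve(query, k))
        return hits / len(labelled)

    def stub_retrieve(query: str, k: int) -> list[str]:
        # Stand-in for the real retriever; a real test would query the vector store.
        index = {"reset password": ["doc_auth", "doc_faq"],
                 "expense policy": ["doc_hr"]}
        return index.get(query, [])[:k]

    if __name__ == "__main__":
        labelled = {"reset password": "doc_auth", "expense policy": "doc_hr"}
        print(f"retrieval hit rate: {hit_rate(labelled, stub_retrieve):.2f}")

Tracking a metric like this across releases is one way to validate fixes with new examples from the same pattern category, as described above, instead of re-testing only the originally reported failures.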