QA Lead (AI/ML Testing)

The Fountain Group

Atlanta (GA)

Remote

Full time

7 days ago

Job summary

The Fountain Group seeks a QA Lead for LLM testing. In this 100% remote role, you'll lead comprehensive testing strategies for AI controls, ensuring compliance with standards. If you're skilled in Python and automated testing, this opportunity could be your next big challenge.

Benefits

Health Insurance

Dental Insurance

Vision Insurance

Life Insurance

Disability Insurance

Qualifications

Strong experience in Python coding and writing tests.
Familiarity with evaluating LLM outputs.
Experience with automated testing tools like pytest.

Responsibilities

Lead QA for LLM testing ensuring reliability and performance.
Develop testing strategies for semantic similarity, validation, and evaluation metrics.
Collaborate with various teams to set testing requirements.

Skills

Python

Automated Testing

Testing Strategy Design

GenAI Knowledge

Tools

pytest

Playwright

PAY: $70-80/hour W2. Our company offers our consultants a suite of benefits after a qualification period, including health, vision, dental, life, and disability insurance.
100% remote role, no expectation of onsite work
W2 Candidates only
6+ month contract role

Manager Notes

This role blends manual and automated testing, automating wherever possible. The team uses Playwright (similar to Selenium), Python with pytest, and Jest, and is open to other toolsets.
Seeking someone who can lead and set strategy for QA testing, not simply execute it.
The role will develop a comprehensive testing strategy. Some knowledge of or experience with GenAI is ideal. Open to functions that involve creativity around testing methods. Legal and HR were mentioned as potentially interesting domains, as they involve sensitive, protected datasets.
Ideally a true QA expert who can formulate the approach, execute it, and dive into code.
From a risk/GenAI perspective, an understanding of bias checking for the GenAI tool.

Description:

As the QA lead for LLM testing, you will define and execute the technical vision and strategy for AI controls and testing.
Responsibilities will include continuous monitoring, evaluation, and reporting of LLM features to ensure compliance with internal standards, best practices, and external regulations.
Play a key role in risk assessment and mitigation, guiding the responsible development and deployment of LLMs.
You will design and implement test cases for LLM governance and development, enabling your team to define features and mitigate risks.
Develop tools, automation strategies, and data pipelines to support scalable LLM management.
Create standardized reporting templates for both technical and senior leadership audiences, ensuring clear communication of results.
Work will involve close collaboration with tool owners and senior management to present findings, assess risk implications, and propose enhancements to AI tools.

Responsibilities

Lead QA efforts for the platform, focusing on LLM output testing to ensure reliability, accuracy, and performance
Develop and maintain comprehensive testing strategies, including semantic similarity, Q&A validation, claims verification, LLM judge evaluations, and metrics like ROUGE
Collaborate with engineering, product, and data science teams to define testing requirements, thresholds, and standards
Design and implement robust test cases aligned with business goals and user needs
Write and maintain automated tests in Python using frameworks like pytest (prior experience with Opik is not required)
Monitor and improve test stability to support application changes
Establish and track QA KPIs, such as test coverage and stability, to measure and communicate platform quality
Stay updated on industry best practices for GenAI/LLM testing and integrate them into QA processes
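
For a sense of what the metric-based checks above might look like day to day, here is a minimal sketch of a pytest-style test that gates an LLM output on a ROUGE-1-style score. It is an illustration only, not the team's actual suite: `rouge1_f1`, `ROUGE1_THRESHOLD`, the 0.6 value, and the sample strings are all hypothetical, and the scorer is a simplified unigram-overlap F1 (whitespace tokens, no stemming) rather than a full ROUGE implementation.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1-style F1: unigram overlap on lowercased whitespace tokens."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical pass bar; in practice this would be agreed with product and data science.
ROUGE1_THRESHOLD = 0.6


def test_summary_meets_rouge_threshold():
    reference = "the policy covers dental and vision insurance"
    # Stand-in for a real model call (assumption: the output is captured as a string).
    model_output = "the policy covers vision and dental insurance"
    assert rouge1_f1(model_output, reference) >= ROUGE1_THRESHOLD
```

Saved in a file such as `test_llm_outputs.py` (a hypothetical name), this would be collected and run by `pytest`. Because the score ignores word order, unigram overlap alone cannot catch reordered or contradictory claims, which is one reason such a metric would normally sit alongside the semantic similarity, claims verification, and LLM judge checks listed above.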

Qualifications

Strong experience in writing and maintaining Python code
Familiarity with testing LLM outputs, including semantic similarity, Q&A validation, claims verification, LLM judges, and evaluation metrics like ROUGE
Experience with automated testing tools (e.g., pytest); willingness to learn Opik if unfamiliar
Proven ability to design and implement test strategies for complex systems

Who We Are:
The Fountain Group is a nationwide staffing firm with over 80 Fortune 100-500 clients. Since 2001, TFG has maintained a consistent standard of excellence, and our work is broadly recognized every year through numerous industry performance awards. Our success is a team effort.
Browse our website below for additional information on our company.
The Fountain Group
3407 W Martin Luther King Jr. Dr. Tampa, FL 33607
“We work in Life Sciences, Clinical, Engineering, IT, and more. Above all, we specialize in people.”
