SBS - GenAI R&D Automation Testing Senior Software Quality Engineer - SBS - Paris

Sopra Steria

Courbevoie

On-site

EUR 45,000 - 68,000

Full-time

30+ days ago

Job summary

A leading company in AI technology seeks a GenAI QA Engineer to develop and maintain quality assurance systems for its AI agent platform. The role involves designing automated testing frameworks, collaborating with AI engineers, and ensuring compliance in AI responses. It is central to the performance and reliability of the platform.

Qualifications

5 years in software testing, including at least 2 years focused on AI/ML systems.
Experience testing chatbots or conversational AI is a must.
Familiarity with Python and LLM frameworks such as LangChain.

Responsibilities

Design automated testing frameworks for RAG pipelines.
Collaborate with AI engineers to define quality metrics.
Ensure compliance and validate data privacy measures.

Knowledge

Testing LLM applications

Understanding of RAG architectures

Proficiency in Python

Familiarity with NLP concepts

Knowledge of embedding models

Education

Bachelor's degree in Computer Science or a related field

Tools

pytest

MLflow

Locust

As a GenAI QA Engineer, you will ensure the quality and reliability of our RAG-based AI agent platform. Your responsibilities include:

Design and implement automated testing frameworks for RAG pipelines, including:

Vector database performance and accuracy testing
Retrieval quality metrics and relevance scoring
LLM response validation and hallucination detection
End-to-end agent conversation flow testing
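
For illustration only, the retrieval quality metrics mentioned above (and the precision, recall, and MRR named under Domain Knowledge) could be computed along the following lines. This is a minimal sketch, not the platform's actual tooling: the function names and the use of plain string document IDs are assumptions made for the example.

```python
from typing import Sequence, Set, Tuple


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved document IDs that are relevant."""
    if k <= 0:
        return 0.0
    top_k = retrieved[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant) / k


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant document IDs that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def mean_reciprocal_rank(runs: Sequence[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average of 1/rank of the first relevant hit per query (0 if none)."""
    if not runs:
        return 0.0
    total = 0.0
    for retrieved, relevant in runs:
        for rank, doc_id in enumerate(retrieved, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(runs)


# Hypothetical ground truth: one query whose relevant documents are d1 and d2.
retrieved_ids = ["d3", "d1", "d7"]
relevant_ids = {"d1", "d2"}
print(precision_at_k(retrieved_ids, relevant_ids, k=3))
print(recall_at_k(retrieved_ids, relevant_ids, k=3))
print(mean_reciprocal_rank([(retrieved_ids, relevant_ids)]))
```

In practice, such functions would be run against a ground truth dataset of queries and labelled relevant documents, with thresholds asserted in a test suite.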

Develop specialized test suites for AI/ML components:

Knowledge base ingestion and chunking strategies
Embedding quality and semantic search accuracy
Prompt injection and security vulnerability testing
Multi-modal content handling (documents, tables, images)

Create automated evaluation frameworks for:

Agent response accuracy and consistency
Contextual understanding and reasoning capabilities
Performance benchmarking across different LLMs
A/B testing for prompt engineering optimization

Collaborate with AI engineers to:

Define quality metrics for RAG architectures
Establish ground truth datasets for evaluation
Design test scenarios for edge cases and failure modes

Build testing infrastructure for:

Knowledge base versioning and rollback testing
API rate limiting and scalability testing
Integration testing with customer systems

Ensure compliance and safety:

Test for bias and fairness in AI responses
Validate data privacy and security measures
Implement guardrails testing for harmful content
Document AI system limitations and failure modes

Develop comprehensive test strategies for RAG-based AI agents.

Create automated benchmarks for retrieval quality and response accuracy.

Build dashboards for monitoring AI system performance in production.

Collaborate with customers to understand their AI agent requirements.

Contribute to AI safety and alignment best practices.

Qualifications

Required Skills:

Education: Bachelor's degree in Computer Science, Engineering, AI/ML, or a related field.

Experience: 5 years in software testing, with at least 2 years focused on AI/ML systems.

Experience testing LLM applications, chatbots, or conversational AI
Understanding of RAG architectures and vector databases (Pinecone, Weaviate, Qdrant)
Familiarity with embedding models and similarity search concepts
Knowledge of prompt engineering and LLM evaluation metrics

Technical Skills:

Proficiency in Python for test automation and AI/ML frameworks
Experience with LLM frameworks (LangChain, LlamaIndex, Haystack)
API testing for RESTful services and streaming endpoints
Familiarity with ML testing tools (MLflow, Weights & Biases, Neptune)

Automation Frameworks:

pytest and unittest for Python-based testing
Experience with async testing for streaming responses
Load testing tools for AI endpoints (Locust, k6)
CI/CD integration with model deployment pipelines
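
As a rough idea of what async testing for streaming responses can look like, the sketch below uses only the Python standard library. The `fake_stream` generator is a hypothetical stand-in for a real streaming LLM endpoint, and `collect_stream` is an illustrative helper, not part of any framework listed here.

```python
import asyncio
import unittest
from typing import AsyncIterator, List


async def fake_stream(chunks: List[str], delay: float = 0.0) -> AsyncIterator[str]:
    """Hypothetical stand-in for a streaming endpoint: yields chunks one by one."""
    for chunk in chunks:
        await asyncio.sleep(delay)
        yield chunk


async def collect_stream(stream: AsyncIterator[str], timeout: float = 1.0) -> str:
    """Join all chunks; raise asyncio.TimeoutError if the stream takes too long."""

    async def _drain() -> str:
        return "".join([chunk async for chunk in stream])

    return await asyncio.wait_for(_drain(), timeout=timeout)


class StreamingResponseTest(unittest.IsolatedAsyncioTestCase):
    async def test_chunks_are_reassembled_in_order(self) -> None:
        text = await collect_stream(fake_stream(["Hel", "lo", "!"]))
        self.assertEqual(text, "Hello!")

    async def test_slow_stream_times_out(self) -> None:
        with self.assertRaises(asyncio.TimeoutError):
            await collect_stream(fake_stream(["a", "b"], delay=0.2), timeout=0.05)
```

Saved as a test module, this can be run with `python -m unittest`. A pytest-based suite would need an async plugin to run coroutine tests directly.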

Domain Knowledge:

Understanding of NLP concepts and evaluation metrics (BLEU, ROUGE, BERTScore)
Knowledge of information retrieval metrics (precision, recall, MRR)
Familiarity with financial services use cases for AI agents
Understanding of responsible AI principles
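
To give a flavour of the overlap-based evaluation metrics listed above, here is a deliberately simplified unigram-overlap F1 score, in the spirit of ROUGE-1. It is an assumption-laden sketch: whitespace tokenization and lowercasing replace the stemming and tokenization rules of the real metric, so its scores will not match an official ROUGE implementation.

```python
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1-style F1 over lowercased, whitespace-split tokens."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical model answer compared with a reference answer.
print(unigram_f1("the cat", "the cat sat down"))
```

A lexical score like this only rewards shared words, which is why embedding-based metrics such as BERTScore are often used alongside it.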

Preferred Qualifications:

Experience with cloud AI services (AWS Bedrock, Azure OpenAI, Google Vertex AI)
Knowledge of vector database optimization and indexing strategies
Familiarity with fine-tuning and model evaluation workflows
Experience with multilingual AI systems testing
Understanding of regulatory requirements for AI in financial services (EU AI Act, GDPR)
Contributions to open-source AI/ML testing frameworks

Employment Type: Full-time

Key Skills

Laboratory Experience, Vendor Management, Design Controls, C/C++, FDA Regulations, Intellectual Property Law, ISO 13485, Research Experience, SolidWorks, Research & Development, Internet of Things, Product Development

Experience: years

Vacancy: 1
