Natural Language Based Secure Federated Query Building

INRIA

Valbonne

On-site

EUR 40,000 - 60,000

Full-time

Posted 30+ days ago

Job summary

A French research institute in Valbonne is seeking an AI researcher to build a conversational agent and handle federated queries. The ideal candidate holds a PhD and has strong experience with Semantic Web technologies. Benefits include subsidized meals and 7 weeks of annual leave.

Benefits

Subsidized meals

Partial reimbursement of transport costs

7 weeks of annual leave

Teleworking possible after 6 months

Professional equipment available

Access to vocational training

Qualifications

Strong experience with Semantic Web standards and technologies.
Experience in federated queries and distributed data management.
Good command of natural language processing techniques.

Responsibilities

Design a conversational agent for building federated queries.
Manage the project plan, the deliverables, and the team meetings.

Knowledge

Experience with Semantic Web standards and technologies

Experience in federated queries

Expertise in language models

Web development skills

Education

PhD in Informatics / Computer Science

Context and assets of the position

Inria is the French national research institute dedicated to computer science and applied mathematics and is a founding member of the World Wide Web Consortium (W3C).

The Inria centre at Université Côte d'Azur includes 42 research teams and 9 support services. The centre's staff (about 500 people) is made up of scientists of different nationalities, engineers, technicians, and administrative staff. The teams are mainly located on the university campuses of Sophia Antipolis and Nice, as well as in Montpellier, and work in close collaboration with research and higher education laboratories and establishments (Université Côte d'Azur, CNRS, INRAE, INSERM, etc.) and with regional economic players.

The Wimmics team works on the topic of AI on the Web, in particular knowledge graphs and (linked) data representation and processing in the Semantic Web. Wimmics contributes to knowledge formalization and semantic-based methods to extract, control, query, validate, infer, explain and interact with knowledge in epistemic communities on the Web. Wimmics has been involved in numerous European research projects (e.g., CoMMA, SevenPro, Sealife, Palette, Aloof, MIREL, HyperAgents) and national projects with a focus on knowledge representation and reasoning (KRR) in the Semantic Web (e.g., ISICIL, DataLift, SMILK, WASABI, D2KAB, Dekalog).

This research engineer position takes place within the context of the ANR project SaFE-KG, with partners in Nantes and Lyon. The project intends to design a platform for a secure federation of knowledge graphs that goes beyond traditional models that assume public accessibility. Its approach focuses on enabling trustworthy, scalable, and efficient collaboration across organisations while maintaining strict security and compliance. The project aims to create a unified model for representing access and usage rules so that each organisation’s policies remain intact even within a federation. It also seeks to redesign core federation engine components—such as source selection, query decomposition, provenance tracking, and access protocols—to support secure and high-performing federated querying. Finally, SaFE-KG will provide user-friendly tools and natural language interfaces that allow non-technical experts to build secure queries across federations, lowering barriers to collaboration. By addressing these challenges, SaFE-KG intends to protect sensitive data while ensuring responsive and scalable systems for real-world applications.

Assignment

The main task of the candidate will be to provide both a federated query builder and an end-user interface. The core idea is to couple generative models and linked data models to support a natural language-based interaction to build a federated query including secured sources and provide results.

The goal is to design a conversational agent supporting a dialogical interaction to incrementally build queries distributed over several sources and explain the obtained results. The process will involve a question-answer session with the user and rely on the latest NLP techniques, in particular techniques coupling language models and knowledge graphs to augment the query context and ground the results in the graph [Wim1, Wim2, Wim5, Wim6].

Integrating results on how graph sampling can be used efficiently for federated query building (from other partners of the project) and our experience in building knowledge graph indexes [Wim3, Wim4], we plan to couple graph sampling techniques and Graph Retrieval Augmented Generation (GraphRAG) to augment language model prompts with context information. We will investigate how such a query-building process can align with the access control model proposed (Task 1) while ensuring that sampling strategies allow us to build efficient contexts (Task 2).

The targeted user interface should interactively and incrementally transform a natural language question into a federated SPARQL query while providing essential data provenance to build trust in the generated queries and their results. This interface will also follow an interaction design methodology for its specification and an HCI protocol for its user evaluation, for which we also have experience [Wim7].

The candidate will also manage the project plan, the deliverables, and the meetings for our team.

Main activities

The work plan for this position includes:

Language model-based query building: design and prototyping of a conversational agent for the dialogical building and refinement of a query, focusing on a first general scenario with open access.
Interactive source selection with access control: access-control-based RAG and natural language interactions to specify and select the sources and move towards a federated query that integrates the security constraints.
Presenting, explaining, and justifying results: visual and interactive interface and evaluation protocols to build trust in the obtained results and in the conclusion of the dialog.
Project follow-up: follow and manage our team's tasks in the SaFE-KG project, participate in its calls and meetings, and take part in the supervision of internships.

Building on previous experience [Wim1, Wim5], we will start by adapting and evaluating the performance of different language models, large and small, on the different sub-tasks involved in providing dialogical access to knowledge graphs. For this first stage, we will consider the federation engine FedUP and our previous work on indexes [Wim4].

To move to access control and usage control, we will explore how RAG and GraphRAG techniques, as well as LLMs’ tool models, can be adapted to the problem of generating contexts under access control constraints at each step of the dialog for building a query and explaining its results. In particular, we will consider alternatives where the federation mechanisms are extended to handle indexes and embeddings at the best location (i.e., locally, at the sources).

To ensure an optimal user experience, we will adopt user-centered protocols to design and evaluate the conversational agent, focusing on interface usability, the relevance of responses to user queries, and the quality of interactions between the user and the agent.

References

[Wim1] C. Ringwald, Fabien Gandon, Catherine Faron, Franck Michel, H. A. Akl. 12 shades of RDF: Impact of Syntaxes on Data Extraction with Language Models. ESWC 2024 Extended Semantic Web Conference. ⟨hal-04581124⟩

[Wim2] C. Ringwald, Fabien Gandon, Catherine Faron, Franck Michel, H. A. Akl. Learning Pattern-Based Extractors from Natural Language and Knowledge Graphs Applying Large Language Models to Wikipedia & the Linked Open Data (POSTER). 38th Annual AAAI Conference on AI. ⟨hal-04526139⟩

[Wim3] P. Maillot, J. Andersen, Sylvie Cazalens, Catherine Faron, Fabien Gandon, Philippe Lamarre, Franck Michel. An Open Platform for Quality Measures in a Linked Data Index. The ACM Web Conference 2024, pp. 1087-1090. ⟨10.1145/3589335.3651443⟩. ⟨hal-04575211⟩

[Wim4] P. Maillot, O. Corby, Catherine Faron, Fabien Gandon, Franck Michel. IndeGx: A Model and a Framework for Indexing RDF Knowledge Graphs with SPARQL-based Test Suits. Journal of Web Semantics, 2023. ⟨10.1016/j.websem.2023.100775⟩. ⟨hal-03946680⟩

[Wim5] E. Tysinger, M. Pagni, O. Kirchhoffer, F. Mehl, Fabien Gandon, et al. An Artificial Intelligence Agent for Navigating Knowledge Graph Experimental Metabolomics Data. 2023 Swiss Metabolomics Society Annual Meeting, ETH Zurich. ⟨hal-04381448⟩

[Wim6] C. Ringwald, Fabien Gandon, Catherine Faron, Franck Michel, H. A. Akl. Kastor: Fine-tuned Small Language Models for Shape-based Active Relation Extraction. ESWC 2025.

[Wim7] Aline Menin, M. N. Do, C. Dal Sasso Freitas, O. Corby, Catherine Faron, et al. Using Chained Views and Follow-up Queries to Assist the Visual Exploration of the Web of Big Linked Data. International Journal of Human-Computer Interaction, 2022.

Skills

Preferably, the candidate should hold a PhD in Informatics / Computer Science and must match most of the following criteria:

Strong experience with Semantic Web standards and technologies
Experience in federated queries and possibly other techniques of distributed data management, querying, crawling, indexing, federating, etc.
Expertise in language models, natural language processing, question-answering.
High motivation for scientific research in an open science context
Good Web development technical skills

Other appreciated skills

Some knowledge of HCI design and evaluation methods
Language: excellent spoken and written English
Writing skills and motivation for publication
Aptitude to work with others and engage in collaborations
Autonomy and initiative: making technical decisions within the project and justifying those choices
Remote working capabilities (emails, collaborative tools, trackers, etc.)

Benefits

Subsidized meals
Partial reimbursement of public transport costs
Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
Professional equipment available (videoconferencing, loan of computer equipment, etc.)
Social, cultural and sports events and activities
Access to vocational training
Contribution to mutual insurance (subject to conditions)

Remuneration

From €2,692 gross per month (depending on degree and experience).
