
A research organization studying biological interactions is offering an M2 (second-year Master's) internship focused on developing tools for predicting protein-carbohydrate interactions. The ideal candidate is proficient in Python and has experience in machine learning. This project is an excellent opportunity to contribute to cutting-edge research in bioinformatics.
Keywords: protein-carbohydrate interactions, deep learning, molecular dynamics, large-scale analysis
Carbohydrates play essential biological roles as structural components, energy reservoirs, and key mediators of molecular communication at the cell surface. Their diverse architectures enable precise recognition events that regulate immunity, development, and host–pathogen interactions. As a result, protein–carbohydrate contacts influence processes ranging from tumor progression to viral and bacterial infection, making both carbohydrates and their binding proteins valuable targets for therapeutic design. However, comparing protein–carbohydrate interfaces remains challenging due to carbohydrate diversity, ligand flexibility, and experimental limitations.
In 2024, our group released the DIONYSUS database (1), which gathers the carbohydrate-containing structures from the Protein Data Bank, annotated with all available general and carbohydrate-specific information on both proteins and ligands. Moreover, clustering the non‑covalent carbohydrate‑binding sites according to their 3D geometry allowed us to reveal functional annotations missing from state‑of‑the‑art curated databases (2).
In its current state, DIONYSUS provides an integrated, user-friendly platform for exploring binding‑site similarities, carbohydrate specificity, and complex quality. It offers a robust foundation for comparative analysis of carbohydrate‑binding sites, as well as strong potential for developing deep learning methods to predict protein‑carbohydrate interactions.
The student will have the opportunity to contribute to the following axes, which are being actively explored as the project develops further:
The ideal candidate should therefore be proficient in Python (in particular, scikit‑learn and PyTorch), have experience in developing and applying machine learning models, and be familiar with basic structural bioinformatics concepts.
References:
(1) Gheeraert A, Bailly T, Ren Y, Hamraoui A, Te J, Vander Meersche Y, Cretin G, Leon Foun Lin R, Gelly J-C, Pérez S, Guyon F, Galochkina T. DIONYSUS: a database of protein‑carbohydrate interfaces. Nucleic Acids Res 53(D1), D387‑D395 (2025). DOI: 10.1093/nar/gkae890
(2) Gheeraert A, Guyon F, Pérez S, Galochkina T. Unraveling the diversity of protein‑carbohydrate interfaces: insights from a multi‑scale study. Carbohydr Res, 109377 (2025). DOI: 10.1016/j.carres.2025.109377
Other recent publications of the team:
Vander Meersche Y, Cretin G, Gheeraert A, Gelly J-C, Galochkina T. ATLAS: protein flexibility description from atomistic molecular dynamics simulations. Nucleic Acids Res 52(D1), D384‑D392 (2024). DOI: 10.1093/nar/gkad1084
Vander Meersche Y, Diharce J, Gelly J-C, Galochkina T. Flexibility or uncertainty? A critical assessment of AlphaFold 2 pLDDT. Structure (2025). DOI: 10.1016/j.str.2025.09.001
Vander Meersche Y, Duval G, Cretin G, Gheeraert A, Gelly J-C, Galochkina T. PEGASUS: Prediction of MD‑derived protein flexibility from sequence. Protein Science 34:e70221 (2025). DOI: 10.1002/pro.70221
Vander Meersche Y, Cretin G, de Brevern A G, Gelly J-C, Galochkina T. MEDUSA: Prediction of protein flexibility from sequence. J Mol Biol 433(11), 166882 (2021). DOI: 10.1016/j.jmb.2021.166882