

PhD Thesis proposal (3 years) M/F - Reinforcement Learning for Formal Verification of Liveness [...]

CNRS

France

On-site

EUR 25 000 - 30 000

Full-time


Job summary

A leading research organization in France is seeking a PhD candidate specializing in computer science. The role focuses on the development of algorithms to address livelock issues in distributed protocols using reinforcement learning. Candidates should have a strong theoretical background, with a master’s degree in computer science or related fields, and be motivated to explore reinforcement learning and formal verification. This full-time position offers an opportunity to contribute to cutting-edge research in a collaborative environment.

Qualifications

  • Excellent theoretical background in computer science or related fields, with an (ongoing) master’s degree.
  • Knowledge (course work, internship, or master’s thesis) in at least one of the two fields (reinforcement learning, and formal verification or logic), and a strong motivation to learn.

Responsibilities

  • Develop algorithms for automatically proving the absence of livelocks.
  • Analyze maximal wait times using reinforcement learning algorithms.
  • Model problems as RL problems, focusing on rewards and algorithm scaling.

Knowledge

Reinforcement learning
Formal verification
Computer science theory

Education

Master's degree in computer science or related fields

Job description

  • Organisation/Company: CNRS
  • Department: Computer Science
  • Research Field: Computer science » Informatics
  • Researcher Profile: First Stage Researcher (R1)
  • Positions: PhD Positions
  • Country: France
  • Application Deadline: 3 Jan 2026 - 00:00 (Europe/Paris)
  • Type of Contract: Temporary
  • Job Status: Full-time
  • Funded through the EU Research Framework Programme? Not funded by an EU programme
  • Related to a staff position within a Research Infrastructure? No

Mitsubishi Electric R&D Centre Europe is the European R&D centre of the Corporate R&D organisation of Mitsubishi Electric. Situated at the heart of Europe’s leading R&D community, Mitsubishi Electric R&D Centre Europe comprises two entities, one located in Rennes, France and the other in Livingston, UK. We conduct R&D into next-generation communication and information systems, power electronic systems, and environment and energy systems.

Inria – IRISA is a joint academic research center in computer science of Inria, CNRS and Université de Rennes, located in Rennes, France. The center offers an international environment; about half of the PhD students and some of the faculty are international.

We are interested in developing formal verification techniques for verifying the correctness of distributed algorithms and concurrent software, focusing, in this work, on liveness properties such as absence of livelocks.

Thesis subject

A livelock in concurrent software is a situation where two threads keep executing, but each requires a resource held by the other, so they enter a loop in which neither makes progress. A classic example is two polite people trying to eat soup with a shared spoon: both want to eat, but the person holding the spoon is too polite and passes it to the other before eating. Since the other person behaves the same way, they end up passing the spoon back and forth forever, and no one ever actually eats the soup.
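
This spoon-passing scenario can be sketched in a few lines of code (an illustrative toy, not part of the thesis subject; all names are invented). Both diners keep taking actions at every step, yet the system never makes progress:

```python
# Two "polite" diners share one spoon. Each hands the spoon over whenever
# the other is hungry, so neither ever eats: a livelock, where both keep
# acting but no progress is made.

class Diner:
    def __init__(self, name):
        self.name = name
        self.hungry = True

    def try_to_eat(self, spoon, other):
        if spoon.owner is not self:
            return False               # cannot eat without the spoon
        if other.hungry:               # "politeness": hand the spoon over
            spoon.owner = other
            return False
        self.hungry = False            # would eat here -- never reached
        return True

class Spoon:
    def __init__(self, owner):
        self.owner = owner

alice, bob = Diner("alice"), Diner("bob")
spoon = Spoon(alice)

steps = 0
for _ in range(1000):                  # bounded run: otherwise this never ends
    alice.try_to_eat(spoon, bob)
    bob.try_to_eat(spoon, alice)
    steps += 1
    if not (alice.hungry and bob.hungry):
        break                          # someone ate: progress (never happens)

print(steps, alice.hungry, bob.hungry)  # 1000 True True -> no progress ever
```

Note that any finite prefix of this run is indistinguishable from two threads that are merely slow, which is exactly why livelock detection is hard.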

Livelocks are notoriously difficult to detect because it is difficult to distinguish them from a trace where the threads just need to wait for a long time before making progress (say, because they are waiting for some computations to terminate). Theoretically, the difficulty can be explained by the fact that absence of livelocks is a liveness property (a proof of violation is necessarily an infinite execution), and not a safety property (for which a proof of violation could be finite).

Reinforcement Learning (RL) is a family of techniques that learn an optimal policy for controlling a system by interacting with it [6]. Recently, RL has been used in test generation, where the tester is viewed as a policy and the goal is to find the tester that achieves maximal coverage, or some other criterion such as reaching a specific location in the code. This approach has been applied to symbolic/concolic execution [5, 4] and to fuzzing [1]. Deep reinforcement learning and related neural-network techniques have been used to prove the termination of programs and also to establish their satisfaction of temporal properties [3].
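
To give a flavour of the basic RL setup, here is a minimal tabular Q-learning sketch on a toy five-state chain (the toy MDP and all names are illustrative assumptions, not part of the thesis subject). The agent learns, from interaction alone, that walking right toward the rewarding terminal state is optimal:

```python
# Tabular Q-learning on a 5-state chain: start at state 0, reward 1 for
# reaching terminal state 4. The environment is deterministic; the agent
# learns Q-values by epsilon-greedy exploration.
import random
random.seed(0)

N_STATES = 5                      # chain states 0..4; state 4 is terminal
ACTIONS = (0, 1)                  # 0 = step left, 1 = step right
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, eps = 0.5, 0.9, 0.1

def step(s, a):
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    r = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, r, s2 == N_STATES - 1

def choose(s):
    # epsilon-greedy action selection, breaking ties randomly
    if random.random() < eps or Q[s][0] == Q[s][1]:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[s][a])

for _ in range(500):              # training episodes
    s, done = 0, False
    while not done:
        a = choose(s)
        s2, r, done = step(s, a)
        target = r if done else r + gamma * max(Q[s2])
        Q[s][a] += alpha * (target - Q[s][a])
        s = s2

policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES)]
print(policy[:4])                 # [1, 1, 1, 1]: "right" in every non-terminal state
```

In the thesis, the "system" the agent interacts with would be a distributed protocol rather than a toy chain, and the policy would typically be represented by a neural network instead of a table.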

Detailed objectives

We are interested in developing algorithms for automatically proving the absence of livelocks, or detecting livelock bugs, in distributed protocols using reinforcement learning algorithms. We suggest developing deep RL algorithms to analyze maximal wait times in distributed protocols. The wait time is the number of steps a process executes before it gains access to a resource. There are no livelocks if the wait times are always finite. The work consists of modeling this problem as an RL problem, choosing the right rewards and RL algorithms, and making sure the approach scales to real implementations of distributed algorithms.

Because the overall goal is formal verification, the computed neural policy must be formally verified at the end as in [3]. This can be achieved using SMT solvers or specific abstract interpretation techniques for neural networks.

Here the RL agent chooses the schedule at each step, that is, which process to execute, whether packets are lost, etc., and observes the next global state of the system. It receives a reward of 1 at each step in which a process waiting to access a resource is executed without obtaining that resource. Thus, the distributed protocol can be seen as a game that the RL agent must learn to play in order to exhibit the worst-case behavior.
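
The scheduler-as-adversary view can be sketched as a tiny environment (the environment interface and all names here are illustrative assumptions, not the thesis implementation). The agent's action picks which of two processes runs next and earns reward 1 whenever the scheduled process waits for the lock held by the other; a policy that accumulates unbounded reward witnesses unbounded wait times:

```python
# Toy "scheduler as RL agent" environment: two processes share one lock.
# The agent schedules one process per step; scheduling the non-holder
# yields reward 1 (a wasted wait step), scheduling the holder lets it
# work and release the lock (reward 0).

class SchedulerEnv:
    def __init__(self):
        self.reset()

    def reset(self):
        self.lock_holder = 0       # process 0 starts holding the lock
        self.progress = [0, 0]     # units of work completed per process
        return (self.lock_holder, tuple(self.progress))

    def step(self, action):
        """action: index of the process to execute for one step."""
        if self.lock_holder == action:
            # the holder does one unit of work, then releases the lock
            self.progress[action] += 1
            self.lock_holder = 1 - action
            reward = 0
        else:
            # the scheduled process waits for the lock: reward the adversary
            reward = 1
        return (self.lock_holder, tuple(self.progress)), reward

env = SchedulerEnv()
env.reset()
total = 0
for _ in range(100):
    # worst-case policy: always schedule the process that does NOT hold the lock
    _, r = env.step(1 - env.lock_holder)
    total += r
print(total, env.progress)  # 100 [0, 0]: every step is a wasted wait, no progress
```

In the actual thesis work the policy exhibiting this worst case would be learned rather than hand-written, and the environment would be a real distributed protocol with packet losses and many processes.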

The model and RL algorithms can be chosen either to attempt to prove the absence of livelocks and compute bounds on wait times, or to detect livelock bugs. The precise direction to be taken and the weight given to RL versus formal verification in this work can be chosen according to the student’s background and preferences. The work also includes an extensive bibliographic study, the development of the above algorithms, implementation and experiments.

References
  • [1] Konstantin Böttinger, Patrice Godefroid, and Rishabh Singh. Deep reinforcement fuzzing. In 2018 IEEE Security and Privacy Workshops (SPW), pages 116–122. IEEE, 2018.
  • [2] Malay K. Ganai. Dynamic livelock analysis of multi‑threaded programs. In Shaz Qadeer and Serdar Tasiran, editors, Runtime Verification, pages 3–18, Berlin, Heidelberg, 2013. Springer Berlin Heidelberg.
  • [3] Mirco Giacobbe, Daniel Kroening, Abhinandan Pal, and Michael Tautschnig. Neural model checking. arXiv preprint arXiv:2410.23790, 2024.
  • [4] Jinkyu Koo, Charitha Saumya, Milind Kulkarni, and Saurabh Bagchi. Pyse: Automatic worst‑case test generation by reinforcement learning. In 2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST), pages 136–147, 2019.
  • [5] Ciprian Paduraru, Miruna Paduraru, and Alin Stefanescu. Optimizing decision making in concolic execution using reinforcement learning. In 2020 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), pages 52–61, 2020.
  • [6] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.