Organisation/Company: IMT Atlantique
Department: Doctoral division
Research Field: Computer science » Informatics; Computer science » Programming; Technology » Sound technology; Technology » Communication technology; Technology » Standardisation of technologies; Technology » Telecommunications technology; Engineering » Communication engineering; Engineering » Sound engineering; Mathematics » Applied mathematics
Researcher Profile: First Stage Researcher (R1)
Positions: PhD Positions
Country: France
Application Deadline: 30 Sep 2025 - 23:00 (Europe/Paris)
Type of Contract: Temporary
Job Status: Full-time
Offer Starting Date: 1 Oct 2025
Is the job funded through the EU Research Framework Programme? Not funded by an EU programme
Is the job related to a staff position within a Research Infrastructure? No
Your role is to carry out PhD research on the subject: “Advanced neural coding for mono and stereo audio signals”.
• Overall context and problem statement
Audio compression (or audio coding) is a field that originated in source coding, with a long history marked by the development of numerous codecs, some of which are well known to the general public, such as MP3 or AAC for music transmission or storage.
In recent years, the field of audio coding has been shaken up by deep learning technologies. Artificial neural networks make it possible to achieve very low bit rates.
As a result, a new generation of multimedia signal compression methods has emerged, based on deep learning. Auto-encoder architectures based on Generative Adversarial Network (GAN) learning give very good results, with codecs such as SoundStream, EnCodec, or Descript Audio Codec (DAC). Other approaches, such as diffusion models, are also being investigated.
Current neural audio codecs are essentially mono. Compared with “traditional” codecs, they are generally much more complex in terms of computational resources and require significant storage (on the order of 10 to 80 million parameters, for example).
• Scientific objective – expected outcome and challenges to be addressed
In this context, the aim of the thesis is to design and develop innovative audio coding methods based on deep learning, for mono and stereo signals.
In particular, the thesis will aim to address the following challenges:
Recent approaches such as transformers or diffusion models will be studied, and new neural network architectures will be tested and explored.
• Indicative list of references
1. Minje Kim and Jan Skoglund, “Neural Speech and Audio Coding,” arXiv:2408.06954v1, 2024.
2. Thomas Muller, Stephane Ragot, Laetitia Gros, Pierrick Philippe, and Pascal Scalart, “Speech Quality Evaluation of Neural Audio Codecs,” Interspeech, 2024.
3. N. Zeghidour et al., “SoundStream: An End-to-End Neural Audio Codec,” IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2021, arXiv:2107.03312.
4. R. Kumar et al., “High-Fidelity Audio Compression with Improved RVQGAN,” in Advances in Neural Information Processing Systems, vol. 36, 2023.
5. J. D. Parker et al., “Scaling Transformers for Low-Bitrate High-Quality Speech Coding,” arXiv:2411.19842, Nov. 2024.
6. Yaoxun Xu et al., “MuCodec: Ultra Low-Bitrate Music Codec,” arXiv:2409.13216, Sep. 2024.
Description of the entity and of the team
Orange Innovation brings together the research and innovation activities and expertise of the Group's entities and countries. We work every day to ensure that Orange is recognized as an innovative operator by its customers and we create value for the Group and the Brand in each of our projects. With 720 researchers, thousands of marketers, developers, designers and data analysts, it is the expertise of our 6,000 employees that fuels this ambition every day.
Orange Innovation anticipates technological breakthroughs and supports the Group's countries and entities in making the best technological choices to meet the needs of our consumer and business customers.
At Orange Innovation, you will be part of a team at the cutting edge of innovation and expertise in audio signal processing. The thesis focuses on neural audio compression, a very active research field with many open questions still to be explored. Neural audio compression is already integrated into certain services, so results of the PhD work may be transferred directly to real-life products or services.
Minimum Requirements: