A leading technology company seeks a Master's candidate for a thesis project focused on audio quality prediction using multimodal large language models. Responsibilities include literature review, model development, and performance evaluation against established benchmarks. Ideal candidates should have a background in computer science or related fields and experience with relevant programming tools.
Recent advances in audio-visual large language models (LLMs) have demonstrated their potential in a range of audio and visual comprehension tasks. Prior research has shown that LLMs are effective for reference-free speech quality assessment, both when fine-tuned and in a “zero-shot” setting, which highlights the applicability of LLMs to quality assessment tasks.
This thesis aims to explore and develop a novel approach for predicting the audio quality of compressed audio using multimodal LLMs. The goal is to predict audio quality on the MUSHRA scale by comparing compressed audio with its uncompressed reference.
Responsibilities:
Requirements: