Master Thesis Worker

Dolby Laboratories

Nürnberg

On-site

EUR 40,000 - 60,000

Full-time

10 days ago

Summary

A leading technology company seeks a Master's candidate for a thesis project focused on audio quality prediction using multimodal large language models. Responsibilities include literature review, model development, and performance evaluation against established benchmarks. Ideal candidates should have a background in computer science or related fields and experience with relevant programming tools.

Qualifications

  • Currently pursuing a Master's degree in a relevant technical field.
  • Experience with PyTorch, Python, and NumPy.
  • Basic experience in training deep learning models.

Responsibilities

  • Analyze existing methods for audio quality prediction using multimodal LLMs.
  • Adapt and fine-tune existing audio/visual LLMs for audio quality prediction.
  • Evaluate the model's performance against benchmarks and human scores.

Skills

PyTorch
Python
NumPy

Education

Master's degree in Computer Science, Machine Learning, Statistics, or Electrical Engineering

Job description

Recent advancements in audio/visual large language models (LLMs) have demonstrated their potential in various audio/visual comprehension tasks. Prior research has shown that LLMs are effective for reference-free speech quality assessment, both when fine-tuned and in a “zero-shot” setting, highlighting the applicability of LLMs to quality assessment tasks.

This thesis aims to explore and develop a novel approach for predicting the audio quality of compressed audio using multimodal LLMs. The goal is to predict audio quality on the MUSHRA scale by comparing compressed audio with its uncompressed reference.

Responsibilities:

  • Literature Review: Analyze existing methods for audio quality prediction, focusing on the use of multimodal LLMs.
  • Model Development: Adapt and fine-tune existing audio/visual LLMs to predict full-reference audio quality and, if needed, explore intelligent prompt engineering techniques.
  • Training and Evaluation: Fine-tune the model using the MUSHRA scale as the target metric. Evaluate the model's performance against established benchmarks and human listening scores.
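
As a rough illustration of the evaluation step, the sketch below compares predicted scores with human listening scores using Pearson correlation, Spearman rank correlation, and RMSE. This is only one plausible setup, not the evaluation protocol of the project: the choice of metrics, the function names, and the sample scores are all assumptions made for illustration. The scores are assumed to lie on the 0 to 100 MUSHRA scale, and the code uses only the Python standard library so that it runs without extra dependencies.

```python
import math


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rmse(predicted, target):
    """Root-mean-square error between predictions and targets."""
    squared = sum((p - t) ** 2 for p, t in zip(predicted, target))
    return math.sqrt(squared / len(predicted))


# Hypothetical per-item scores (0 to 100), invented for illustration only.
human = [92.0, 75.0, 58.0, 40.0, 21.0]
predicted = [88.0, 79.0, 55.0, 45.0, 18.0]

print("Pearson: ", round(pearson(predicted, human), 3))
print("Spearman:", round(spearman(predicted, human), 3))
print("RMSE:    ", round(rmse(predicted, human), 3))
```

Pearson measures linear agreement, Spearman measures whether the model ranks items in the same order as listeners, and RMSE measures the absolute error in scale points, so the three are usually reported together.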

Requirements:

  • Currently pursuing a Master's degree in Computer Science, Machine Learning, Statistics, Electrical Engineering, or a related technical field.
  • Experience with PyTorch, Python, and NumPy.
  • Basic experience in training deep learning models.