Data Scientist - Multimodal LLMs (Speech focus)

ConnexAI

Manchester

On-site

GBP 40,000 - 60,000

Full time

30+ days ago

Job summary

A leading company in Conversational AI is seeking a Data Scientist focused on multimodal LLMs, particularly in speech technologies. This role involves researching, implementing, and deploying cutting-edge machine learning systems. Candidates should have a strong background in machine learning, experience with LLMs, and excellent programming skills in Python. This is a unique opportunity to contribute to a greenfield project that integrates speech and language technologies.

Qualifications

  • Background in machine learning or data science, ideally with a research focus.
  • Experience with LLMs, speech technologies, or multimodal systems.

Responsibilities

  • Researching state-of-the-art approaches for multimodal LLMs.
  • Training and fine-tuning models for performance improvement.
  • Collaborating with product and engineering teams.

Skills

Machine Learning
Data Science
Python
Communication

Education

PhD
MSc

Tools

PyTorch

Job description

ConnexAI recognized by Viva Tech: Top 10 Future Unicorns
Data Scientist - Multimodal LLMs (Speech focus)

Location:

Manchester, UK

ConnexAI is developing an ambitious new product to enhance our large language models with speech-to-speech capabilities. This greenfield project offers a unique opportunity to help define its research direction and build the machine learning systems that will power it. We’re seeking a data scientist with a strong research background in machine learning and a focus on speech or multimodal systems. In this role, you’ll work at the intersection of speech and language technologies, exploring how to integrate these modalities into deployable models. You’ll collaborate closely with engineers, researchers, and product leaders to design, prototype, and deploy state-of-the-art models.

What You'll Be Doing

  • Researching state-of-the-art approaches for incorporating audio data into multimodal LLMs for speech-to-text, text-to-speech, and speech-to-speech tasks
  • Implementing and adapting techniques from recent academic papers into practical, production-ready solutions
  • Training and fine-tuning models, and iterating on architectures to improve performance and scalability
  • Sourcing, curating, and preparing datasets for model training and evaluation
  • Defining evaluation metrics and testing frameworks for multimodal systems
  • Collaborating with product and engineering teams to translate research concepts into deployable features
  • Contributing to improving the team's workflows to help foster a healthy, productive, and innovation-focused environment

What We're Looking For

  • Background in machine learning or data science, ideally with a research focus (PhD, or MSc with equivalent industry experience)
  • Experience working with LLMs, speech technologies (ASR, TTS), or multimodal systems
  • Strong programming skills in Python, with experience using PyTorch
  • Hands-on experience with training and fine-tuning ML models, including setting up experiments and evaluating results
  • Ability to read, interpret, and implement techniques from recent academic papers as practical, working solutions
  • Strong communication skills and comfort working across interdisciplinary teams
  • A collaborative mindset and an interest in helping improve team workflows
  • Curiosity and a willingness to learn

About ConnexAI

ConnexAI is an award-winning Conversational AI platform. Designed by a world-class engineering team, ConnexAI's technology enables organizations to maximize profitability, increase revenue, and take productivity to new levels. ConnexAI provides cutting-edge, enterprise-grade AI applications including AI Agent, AI Guru, AI Analytics, ASR, AI Voice, and AI Quality.
