Senior Researcher - Diffusion-Based Large Language Models

Huawei Technologies Canada Co., Ltd.

Montreal

On-site

CAD 100,000 - 140,000

Full time

30+ days ago

Job summary

A leading technology company in Montreal is seeking a Senior Researcher to advance AI research, especially in diffusion-based language models. Candidates should have a Ph.D. in a related field, strong generative modeling skills, and expertise in Python and deep learning frameworks. This position emphasizes both individual research contributions and collaboration with engineering teams. Proficiency in English is essential for communication with international colleagues.

Qualifications

Deep understanding of generative modeling and diffusion models.
Hands-on expertise with modern LLM architectures.
Strong theoretical grounding in machine learning.

Responsibilities

Propose and develop novel research ideas.
Deliver end-to-end projects and analyze results.
Publish research in top-tier ML and NLP venues.

Skills

Generative modeling expertise

Deep understanding of diffusion models

Strong programming skills in Python

Hands-on expertise with modern LLM architectures

Excellent communication skills

Education

Ph.D. in Computer Science or related field

Tools

Python

PyTorch

Hugging Face Transformers

Huawei Canada has an immediate permanent opening for a Senior Researcher.

About the team:

Founded in 2012, the Noah’s Ark Lab has evolved into a prominent research organization with notable achievements in academia and industry. The lab’s mission focuses on advancing artificial intelligence and related fields to benefit the company and society. Driven by impactful, long-term projects, the lab aims to advance state‑of‑the‑art research while integrating innovations into the company's products and services. Its research areas include LLMs, RL, NLP, computer vision, AI theory, and autonomous driving.

About the job:

Propose and develop novel research ideas to advance the state of the art in diffusion‑based language models and generative modeling for text (and possibly multimodal data).
Design and study non‑autoregressive and diffusion‑style architectures for LLMs, including discrete/continuous diffusion, denoising objectives, and hybrid transformer–diffusion models.
Participate in applied research projects focused on scaling diffusion LLMs to long contexts, improving sample quality and controllability, and reducing inference latency and compute costs.
Deliver end‑to‑end projects: design and run experiments, implement and debug large‑scale training pipelines, analyze results, and clearly communicate findings to both technical and non‑technical stakeholders.
Collaborate with data and infra teams on data curation, evaluation, and post‑training for diffusion LLMs (e.g., fine‑tuning, RLHF/feedback, safety, and robustness).
Publish research in top‑tier machine learning and NLP venues and contribute to open‑source libraries, benchmarks, or model releases where appropriate.
(If Senior) Supervise and mentor a small group of researchers and engineers to deliver projects and publish conference papers.
Track advances across diffusion models, LLMs, and the broader AI ecosystem, and translate new developments into actionable research directions for the team.

About the ideal candidate:

A Ph.D. in Computer Science, Machine Learning, Statistics, Applied Mathematics, or a related field (or equivalent research experience).
First‑author publications in top‑tier ML / NLP venues such as NeurIPS, ICML, ICLR, ACL, EMNLP, or NAACL.
Deep understanding of generative modeling and diffusion models (e.g., denoising diffusion, score‑based models, flow‑matching) and how they compare to standard autoregressive LLMs.
Strong theoretical grounding in machine learning (optimization, probabilistic modeling, representation learning) and practical experience turning theory into working systems.
Hands‑on expertise with modern LLM architectures (e.g., Transformer‑based models such as Llama and Qwen) and with diffusion or score‑based models for text, image, or multimodal generation.
Strong programming skills in Python with expertise in PyTorch, and experience implementing custom training loops, distributed training, and large‑scale experimentation.
Experience with modern LLM / diffusion training and inference stacks (e.g., Hugging Face Transformers & Diffusers, Accelerate, DeepSpeed, Megatron‑LM, vLLM or similar).
Familiarity with the evaluation of generative models for language and/or multimodal tasks, including designing robust automatic and human evaluations.
Excellent oral and written communication skills, including the ability to clearly present complex ideas and results, write papers, and collaborate across research and engineering teams.

Huawei aims to support a French‑speaking work environment for its employees in Quebec. We have taken steps to avoid requiring a language other than French for this position. However, proficiency in English is essential for this role for the following reasons:

The person will be required to communicate regularly with colleagues located outside Quebec, where English is the primary language used for communication between offices. In addition, the nature of the tasks related to this position, which falls within a highly specialized field of artificial intelligence, also requires knowledge of English.
