Remote Senior AI Model Inference Engineer — GPU/Vulkan

Tether Operations Limited

Montreal

Remote

CAD 90,000 - 120,000

Full time

Today

Job summary

A leading digital finance company is seeking an experienced AI Model Inference Engineer in Montreal to optimize language models and develop GPU acceleration techniques. You will implement inference and fine-tuning kernels and resolve GPU performance issues. Candidates should have expertise in C++ and Vulkan and a strong background in model optimization. The position is remote and involves collaborating with a global team.

Benefits

Collaborative work environment

Flexible working hours

Opportunity to work on cutting-edge technology

Qualifications

Strong background in quantization and mixed-precision model optimization.
Hands-on experience with mobile GPU acceleration and model inference.
Familiarity with large language model architectures.

Responsibilities

Implement and optimize custom inference and fine-tuning kernels for language models.
Design and extend datatype and precision support.
Investigate and resolve GPU acceleration issues.

Skills

C++

GPU kernel programming

GPU acceleration

Vulkan API

Quantization techniques

LoRA fine-tuning

Mobile GPU debugging

Benchmarking
