Senior AI Inference Engineer: Vulkan & Mobile GPUs

Tether Operations Limited

Johannesburg

Remote

ZAR 800 000 - 1 200 000

Full time

30+ days ago

Job summary

A leading fintech company is seeking an experienced AI Model Engineer to drive innovations in language model inference and optimization. This role demands expertise in GPU acceleration, particularly with Vulkan, and a solid grasp of quantization methods. The ideal candidate will collaborate globally and push the boundaries of AI technology, contributing to the development of cutting-edge fintech solutions.

Qualifications

Expertise in kernel development and model optimization.
Experience with Vulkan for GPU acceleration.
Strong understanding of quantization techniques.

Responsibilities

Implement and optimize inference and fine-tuning kernels for language models.
Design and optimize Vulkan compute shaders.
Collaborate with teams to integrate optimized models into production.

Skills

Proficiency in C++

GPU kernel programming

GPU acceleration with the Vulkan API

Quantization and mixed-precision optimization

Understanding of LoRA fine-tuning

Debugging GPU performance issues

Mobile GPU acceleration

Large language model architectures
