Senior Kernel Developer

Luxoft

United Kingdom

Remote

GBP 50,000 - 70,000

Full time

Today

Job summary

A technology consulting company in the United Kingdom is seeking an AI software development engineer to develop and optimize ML kernels. The role involves working on matrix multiplication and other ML operators, collaborating with the GPU architecture team, and applying software engineering best practices. The ideal candidate is proficient in C/C++ and in CUDA, HIP/ROCm, or OpenCL programming, and has solid problem-solving skills.

Qualifications

  • Must have proficiency with C/C++.
  • Proficiency in CUDA, HIP/ROCm, or OpenCL programming.
  • Solid understanding of parallel programming models and optimization techniques.

Responsibilities

  • Develop ML kernels for matrix multiplication, Flash Attention, and other ML operators.
  • Benchmark, perform competitive analysis, and optimize kernels to improve performance.
  • Collaborate with the GPU architecture team to improve future generations.

Skills

C/C++ proficiency
CUDA or HIP/ROCm or OpenCL programming
Parallel programming models
Optimization techniques
Problem-solving skills
Ability to work in a collaborative environment

Tools

PyTorch
JAX
MLIR
LLVM
GPU assembly

Job description

Project description

Luxoft is looking for an AI software development engineer to develop ML kernels in the Triton kernel language. We want an engineer who is passionate about optimizing machine learning GPU kernels and improving the performance of key applications and benchmarks. Your work directly affects the performance of AMD GPUs and helps us become a competitive solution for generative AI. Become part of our high-impact, talented Triton kernels team.

Responsibilities

  • Develop ML kernels for matrix multiplication, Flash Attention and other ML operators
  • Benchmark, perform competitive analysis and optimize kernels to improve performance
  • Collaborate with the GPU architecture team to improve future generations
  • Apply knowledge of software engineering best practices

Skills

Must have

  • Proficiency with C/C++
  • Proficiency in CUDA or HIP / ROCm or OpenCL programming
  • Solid understanding of parallel programming models and optimization techniques
  • Strong problem-solving skills and the ability to work in a collaborative environment

Nice to have

  • Familiarity with models like Llama, Mixtral and Gemma
  • Knowledge of MLIR, LLVM, GPU assembly and GPU architecture
  • Familiarity with PyTorch or JAX