Machine Learning Performance Engineer

Oxford Knight

London

On-site

GBP 200,000 +

Full time

5 days ago

Job summary

A tech-centric prop trading fund in London is looking for a Machine Learning Performance Engineer. The successful candidate will optimize the performance of machine learning models, focusing on both training and inference. Ideal candidates will have strong knowledge of modern ML techniques and GPU programming, and will enjoy solving complex problems. The role offers a generous benefits package and emphasizes learning and development.

Benefits

Physical & mental health benefits
Excellent holiday entitlement
Significant parental leave
Retirement benefits
Private on-site gym
Tuition reimbursement
Recreation spaces with meals and snacks

Qualifications

  • Understanding of ML techniques and performance optimization.
  • Experience with debugging performance of training runs.
  • Knowledge of CUDA, GPU programming, and memory hierarchy.

Responsibilities

  • Optimize performance of ML models for training and inference.
  • Work on efficient large-scale training and low-latency inference.
  • Enhance whole-system performance including storage and networking.

Skills

Modern ML techniques and toolsets
Debugging and optimization tooling
Low-level GPU and compute cluster knowledge

Tools

CUDA
Nsight Systems
cuDNN

Job description

Machine Learning Performance Engineer, London

Client:

Oxford Knight

Location:

London, United Kingdom

Job Category:

Other

EU work permit required:

Yes

Job Reference:

bff6f7efc14f

Job Views:

33

Posted:

12.08.2025

Job Description:

Summary:

Exciting opportunity to work at a tech-centric prop trading fund which trades a wide range of financial products, with offices across the globe. Looking for an experienced engineer with low-level systems programming and optimization expertise to join their growing ML team.

Machine learning is front and centre at this firm, and your focus will be to optimize the performance of their models: both training and inference. They’re interested in efficient large-scale training, low-latency inference in real-time systems, and high-throughput inference in research. Partly this will involve improving straightforward CUDA, but they also need a whole-systems approach, including storage systems, networking, and host- and GPU-level considerations.

The successful candidate will be a smart, curious software engineer who enjoys finding solutions for complex problems. If you also have a great appetite for learning new things, this role is for you!

Requirements:

  • An understanding of modern ML techniques and toolsets, with a strong focus on performance
  • The systems knowledge & experience required to debug a training run’s performance end to end
  • Low-level GPU and compute cluster knowledge, CUDA or other types of GPU programming, e.g. PTX, SASS, warps, cooperative groups, Tensor Cores, & the memory hierarchy
  • Debugging/optimization tooling experience, e.g. CUDA-GDB, Nsight Systems, Nsight Compute
  • Library knowledge of Triton, CUTLASS, CUB, Thrust, cuDNN, and cuBLAS

Benefits:

  • Generous benefits package, including physical & mental health benefits, excellent holiday entitlement, significant parental leave, retirement benefits, private on-site gym
  • Focus on learning & development with tuition reimbursement
  • Recreation spaces with breakfast, lunch, snacks and treats

Whilst we carefully review every application, the high volume we receive means we cannot respond to applicants who have not been successful.
