C++ Developer - CUDA, Kernel, TensorRT, GPU

Dcoded

England

Remote

GBP 50,000 - 70,000

Full time

Today

Job summary

A tech company is seeking an experienced CUDA Backend Developer to join their engineering team working on GPU-accelerated projects. Responsibilities include designing and implementing CUDA kernels, optimizing GPU performance, and collaborating with researchers. Strong experience in C++ and CUDA C is required, along with skills in GPU profiling and Linux systems development. This role offers a remote working option within the UK or EU.

Qualifications

  • Strong experience in C++ (11/14/17) and CUDA C programming.
  • Proven track record using GPUs for compute-intensive applications in production environments.
  • Hands-on experience with CUDA profiling, debugging, and kernel optimization.

Responsibilities

  • Design and implement GPU kernels in CUDA C, focusing on kernel fusion and on-device operations.
  • Optimize custom models for deployment with TensorRT or similar inference engines.
  • Integrate GPU functionality into back-end APIs and orchestration layers.

Skills

C++ programming
CUDA C programming
GPU optimization
Multi-threaded architectures
Linux systems development
Performance tuning

Tools

Nsight
Docker
Kubernetes

Job description

Overview

We're looking for an experienced CUDA Backend Developer to join a high-performance engineering team working on GPU-accelerated simulation and AI workloads. You'll collaborate with C++ systems engineers and research scientists to design, implement, and optimize GPU-intensive back-end modules that push the limits of performance and scalability.

What You'll Do
  • Design and implement GPU kernels in CUDA C, focusing on:
      • Kernel fusion (a brief illustrative sketch follows this list)
      • On-device operations
      • GPU memory optimization
  • Build and use profiling tools (e.g., Nsight) to measure and improve GPU utilization, inference latency, and training throughput.
  • Optimize custom models for deployment with TensorRT or similar inference engines.
  • Integrate GPU functionality into back-end APIs and orchestration layers.
  • Work closely with research and engineering teams to translate models into performant CUDA implementations.
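As a brief illustration of the kernel-fusion work described above (this sketch is not part of the original brief, and names such as fused_bias_relu are hypothetical), the following minimal CUDA C++ example fuses a bias add and a ReLU into a single kernel so the intermediate values stay in registers instead of making an extra round trip through global memory:

    #include <cuda_runtime.h>
    #include <cstdio>

    // Illustrative only: fuses a bias add and a ReLU into one kernel so the
    // intermediate result never touches global memory between the two ops.
    __global__ void fused_bias_relu(const float* in, const float* bias,
                                    float* out, int n, int bias_len)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            float v = in[i] + bias[i % bias_len];  // bias add
            out[i] = v > 0.0f ? v : 0.0f;          // ReLU, fused in-register
        }
    }

    int main()
    {
        const int n = 1 << 20, bias_len = 256;
        float *in, *bias, *out;
        cudaMallocManaged(&in,   n * sizeof(float));
        cudaMallocManaged(&bias, bias_len * sizeof(float));
        cudaMallocManaged(&out,  n * sizeof(float));
        for (int i = 0; i < n; ++i)        in[i]   = (i % 2) ? 1.0f : -1.0f;
        for (int i = 0; i < bias_len; ++i) bias[i] = 0.5f;

        const int block = 256;
        fused_bias_relu<<<(n + block - 1) / block, block>>>(in, bias, out, n, bias_len);
        cudaDeviceSynchronize();

        printf("out[0] = %f, out[1] = %f\n", out[0], out[1]);  // expect 0.0 and 1.5
        cudaFree(in); cudaFree(bias); cudaFree(out);
        return 0;
    }

In real workloads the same idea extends to fusing longer chains of elementwise and reduction operations, with the gains typically verified using Nsight profiles.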
What We're Looking For
  • Strong experience in C++ (11/14/17) and CUDA C programming.
  • Proven track record using GPUs for compute-intensive applications in production environments.
  • Hands-on experience with CUDA profiling, debugging, and kernel optimization.
  • Deep understanding of multi-threaded/multi-process architectures and Linux systems development.
  • Proficiency in low-level I/O, memory management, and performance tuning.
Nice to Have
  • Experience with distributed training/inference pipelines.
  • Familiarity with Docker and Kubernetes.
  • Exposure to embedded systems or hardware-level software integration.

Start: ASAP
Duration: 6 months (strong potential to extend)
Location: Remote (UK or EU-based preferred)
IR35: Outside

If you're a GPU performance enthusiast who thrives on complex back-end challenges and wants to contribute to cutting-edge AI systems, we'd love to hear from you.

Apply now or get in touch directly for a confidential conversation.
