Overview
Are you a skilled GPU Engineer with expertise in optimising GPU-accelerated workloads and cloud-based performance solutions?
This is an opportunity to contribute to a high-impact project in an international organisation, supporting advanced AI and data-driven use cases by designing, profiling, and streamlining GPU workloads.
Responsibilities
- Conduct benchmarking, profiling, and tuning of GPU workloads to provide recommendations on optimal GPU sharing techniques such as Multi-Instance GPU (MIG), vGPU, Multi-Process Service (MPS), and time-sharing
- Optimise existing and new applications by leveraging GPU parallelisation, identifying bottlenecks, and implementing framework-level improvements
- Analyse deployment methods for GPU-accelerated serving frameworks (e.g. NVIDIA Triton Inference Server, TensorRT, ONNX Runtime), providing reference implementations and best-practice recommendations
- Develop automated configuration templates and workflows for GPU resources using Infrastructure as Code (IaC) tools such as Terraform and Helm
- Implement active GPU monitoring with dashboards and alerts covering utilisation, memory bandwidth, temperature, and power metrics
- Integrate GPU resource provisioning and configuration into CI/CD pipelines to support seamless deployment and rollback
- Support self-service deployment of Large Language Models (LLMs) on GPU resources for application owners
- Document all configurations, benchmarking results, and deployment procedures for transparency and reproducibility
- Deliver knowledge transfer and training sessions for ICT staff on GPU workload optimisation, management, and troubleshooting
What we're looking for
- Minimum of 2 years’ hands-on experience in GPU engineering or cloud-based GPU workload optimisation within enterprise or large-scale environments
- Demonstrable expertise in GPU-accelerated development using CUDA, OpenCL, PyTorch, TensorFlow, TensorRT, and ONNX
- Strong knowledge of performance benchmarking and profiling tools such as Nsight or nvprof
- Proven experience with Infrastructure as Code (Terraform, Helm Charts, or equivalent) and CI/CD pipeline design for GPU-enabled applications
- Working knowledge of Kubernetes and GPU scheduling in containerised environments
- Familiarity with monitoring and observability tools such as Prometheus, Grafana, or NVIDIA DCGM
- Proficiency in scripting languages such as Python, Bash, or PowerShell for automation and monitoring
- NVIDIA certification (preferred)
- Excellent problem-solving, analytical, and troubleshooting skills, with the ability to adapt to evolving requirements
- Strong communication skills with experience collaborating in multicultural teams
If you are a GPU Engineer looking to take the next step in your career and want to contribute your expertise to high-impact optimisation projects in a global environment, this role could be the perfect fit for you. Apply now, or email Richard Fisher at rf@skillsearch.com