GPU Engineer - Contract - Remote

Skillsearch Enterprise Technology

Alessandria

Remote

EUR 65,000 - 85,000

Full-time

Posted yesterday

Job description

A technology consulting firm is seeking a Contract GPU Engineer to conduct performance benchmarking and optimize GPU workloads. This fully remote role requires a minimum of 2 years of experience in GPU engineering. Candidates should have expertise in CUDA and GPU services, along with skills in performance analytics and cloud resource management. Join a dynamic team focusing on advanced GPU technologies.

Qualifications

  • Minimum 2 years of hands-on experience in GPU engineering or cloud-based GPU workload optimization.
  • NVIDIA Certification preferred.

Duties

  • Optimize performance of applications using GPU parallelization.
  • Conduct thorough analysis of GPU-accelerated serving frameworks.
  • Integrate GPU configuration into CI/CD pipelines.

Knowledge

GPU services provisioning
GPU-accelerated software development
Performance benchmarking
Infrastructure as Code
CI/CD pipelines implementation
Kubernetes knowledge
Scripting in Python or Bash

Tools

CUDA
Terraform
TensorRT
Nsight
Prometheus
Grafana

Full job description

Our client, part of the UN, is looking to hire a contract GPU Engineer, initially until the end of 2025.

This is a fully remote position.

About the Role

You will conduct comprehensive performance benchmarking, profiling, and tuning of GPU workloads to provide evidence-based recommendations on suitable GPU sharing techniques.

Responsibilities
  • Optimize performance of existing and new applications by leveraging GPU parallelization, identifying bottlenecks, and deploying code and framework-level improvements.
  • Perform a thorough analysis of the deployment methods for GPU-accelerated serving frameworks in the market, with reference implementations and best-practice recommendations for large-scale serving solutions (e.g., NVIDIA Triton Inference Server, TensorRT, ONNX Runtime).
  • Develop repeatable and automated configuration templates for GPU resources.
  • Implement active GPU monitoring, including review and analysis of all relevant metrics (utilization, memory bandwidth, power, temperature, etc.), and establish dashboards and alerts for proactive performance and health management.
  • Integrate GPU resource provisioning and configuration into CI/CD pipelines using Infrastructure as Code (IaC) tools (e.g., Terraform, Helm charts), and document workflows for seamless deployment and rollback.
  • Document all configurations, testing results, benchmarking analyses, and deployment procedures to ensure transparency and reproducibility.
  • Establish active GPU monitoring protocols, including the identification and evaluation of available metrics, to select the most relevant indicators for ongoing performance management.
  • Support self-service deployment of Large Language Models (LLMs) on GPU resources, enabling application owners with varying technical expertise to access and utilize GPU capabilities seamlessly.
Qualifications
  • Minimum 2 years of hands-on experience in GPU engineering or cloud-based GPU workload optimization, ideally within enterprise or large-scale environments.
  • NVIDIA Certified (Preferred).
Required Skills
  • Direct experience with GPU services, including resource provisioning, scaling, and optimization.
  • Demonstrable expertise in GPU-accelerated software development (CUDA, OpenCL, TensorRT, PyTorch, TensorFlow, ONNX, etc.).
  • Strong background in performance benchmarking, profiling (Nsight, nvprof, or similar tools), and workload tuning.
  • Experience with Infrastructure as Code (Terraform, Helm charts, or equivalent) for automated cloud resource management.
  • Proven experience designing and implementing CI/CD pipelines for GPU-enabled applications using tools such as GitHub Actions (preferred) or similar.
  • Working knowledge of Kubernetes and GPU scheduling, including setup of GPU-enabled clusters and deployment of GPU workloads in Kubernetes.
  • Familiarity with GPU monitoring and observability, using tools such as Prometheus, Grafana, NVIDIA Data Center GPU Manager (DCGM), or custom scripts.
  • Proven ability to analyze deployment approaches for GPU-accelerated serving frameworks and deliver reference implementations.
  • Experience implementing software quality engineering practices (unit testing, code review, test automation, reproducibility).
  • Strong scripting skills in Python, Bash, or PowerShell for automation and monitoring purposes.
  • Excellent analytical, problem-solving, and troubleshooting abilities.
  • Quick learner, adaptable to evolving requirements and emerging GPU / cloud technologies.
  • Positive and collaborative attitude in Agile environments.