
Senior Math Libraries Engineer, CPU and GPU Optimization

NVIDIA

United States

Remote

USD 130,000 - 160,000

Full time

Today

Job summary

A leading technology company in the United States is seeking an expert software engineer to design and optimize CUDA-X libraries across its CPU and GPU ecosystem. The ideal candidate will have over 12 years of experience in high-performance computing, advanced C++ skills, and a passion for technology. Responsibilities include collaborating with teams to deliver math libraries and staying current on software trends. The company offers competitive salaries and a generous benefits package.

Benefits

Generous benefits package
Competitive salaries

Qualifications

  • 12+ years of experience in high-performance computing or AI applications.
  • Experience with ARM, RISC-V, or x86_64 CPU architectures.
  • Background in numerical methods (e.g., FFT, numerical linear algebra).

Responsibilities

  • Design modern APIs and kernels for math libraries.
  • Collaborate with internal and external partners to meet user needs.
  • Deliver timely releases of math libraries.

Skills

Advanced C++ skills
Collaboration
Communication
Documentation habits
Parallel programming
Numerical methods knowledge

Education

PhD or MSc in Computer Science or related field

Tools

CUDA
OpenCL
Python
CMake
CI/CD
Job description
Overview

NVIDIA is looking for an expert software engineer to help us deliver CUDA-X libraries across the NVIDIA CPU and GPU ecosystem. For over a decade, NVIDIA's accelerated computing platform has revolutionized HPC and AI with applications ranging from COVID-19 research to autonomous machines. Did you know that our team develops the GPU/CPU-accelerated mathematical libraries that make all of this possible?

The accelerated computing hardware and software ecosystem is constantly evolving, including shifts towards hybrid backends, deep integration with high-level languages and ecosystems (such as Python, NumPy, JAX, MLIR…), and optimization at runtime for maximum flexibility and performance. Our libraries follow a CUDA Everywhere approach, letting developers use highly optimized mathematical operations on all hardware available in the NVIDIA ecosystem. You will be part of a team designing, developing, and optimizing math libraries for the future. If you are passionate about designing modern HPC libraries and want to build software that will stand the test of time as it accelerates countless applications, we might have the dream job you have been waiting for!

Responsibilities
  • Design modern, flexible, and easy-to-use APIs and kernels for math libraries, and lead design reviews with all collaborators.

  • Work closely with internal partners (e.g., Engineering, Product Management) and external partners (e.g., researchers) to understand their use cases and requirements.

  • Work with internal and external customers to deliver timely math library releases.

  • Become a domain expert by continuously surveying current trends in software systems.

Qualifications
  • PhD or MSc degree in Computer Science, Applied Math, or a related science or engineering field is preferred (or equivalent experience).

  • 12+ years of experience designing and developing software for high-performance computing and/or AI applications.

  • Advanced C++ skills, including modern design paradigms (e.g., template meta-programming, RAII).

  • Parallel programming experience with CUDA, OpenCL or vector programming on CPU (AVX, NEON or similar).

  • Strong collaboration, communication, and documentation habits.

  • Experience with ARM, RISC-V and/or x86_64 CPU architectures.

Ways to stand out
  • Strong background in numerical methods (e.g., FFT, numerical linear algebra).

  • Programming skills in Python, and experience with modern automation setups for both building software (e.g., CMake) and testing it (e.g., CI/CD, sanitizers).

  • Background in cross-compilation, including setting up CPU/GPU/accelerator (cross-)compilation toolchains and porting existing code to new architectures.

  • Experience with CCCL, OpenMP, OpenACC, multi-threading, MPI, PGAS.

  • Experience with scientific and deep learning libraries and frameworks such as PyTorch, JAX, MKL, MAGMA, PETSc, Kokkos, etc.

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us and, due to unprecedented growth, our exclusive engineering teams are rapidly growing. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you!
