HPC Software Engineer for Large-Scale AI Training

Institute of Foundation Models

Abu Dhabi

On-site

AED 120,000 - 200,000

Full time

Today

Job summary

A leading research lab in the UAE seeks a High Performance Computing (HPC) Software Engineer to design and develop software for large-scale AI workloads. You will work at the intersection of HPC and machine learning, ensuring robust infrastructure for cutting-edge model training. Ideal candidates have experience with ML frameworks and distributed communication libraries. Join a collaborative environment driving significant advancements in AI.

Qualifications

  • Proven experience developing and optimizing software for large-scale ML workloads.
  • Deep understanding of Linux kernel internals and accelerator kernel development.
  • Proficiency with distributed communication libraries such as NCCL, MPI, and UCX.

Responsibilities

  • Design and implement high-performance software solutions for AI/ML training.
  • Optimize low-level system components including Linux kernels and GPU/accelerator kernels.
  • Develop and tune communication libraries for large-scale production environments.

Skills

  • Software development for large-scale ML workloads
  • Linux kernel internals
  • Distributed communication libraries
  • Experience with ML frameworks
  • HPC job scheduling tools
  • Debugging and performance tuning
  • Collaborative mindset