Senior Software Engineer - GPU Networking and Communication Middleware

Advanced Micro Devices, Inc.

Santa Clara, CA

Hybrid

USD 120,000 - 180,000

Full time

9 days ago

Job summary

Join a forward-thinking company as a Senior Software Engineer, where your work will directly impact the evolution of GPU technology. You will design and implement cutting-edge networking features for high-performance computing and machine learning applications. Collaborate with experts in a dynamic environment to push the boundaries of what's possible in data center technology. This role offers the chance to contribute to innovative AI-powered products and be part of a team that values collaboration, creativity, and excellence. If you're passionate about technology and eager to make a difference, this is the perfect opportunity for you.

Benefits

Health Insurance
Retirement Plans
Flexible Work Hours
Professional Development Opportunities
Employee Discounts
Wellness Programs

Qualifications

  • Strong background in developing system software in C/C++.
  • Familiarity with GPU programming in HIP or CUDA is a plus.

Responsibilities

  • Design and implement features to enhance GPU support in communication libraries.
  • Benchmark and optimize code for multi-node GPU applications.

Skills

C/C++ System Software Development
Communication Middleware (MPI/SHMEM)
Lower-level Communication Frameworks (UCX, libfabric)
GPU Programming (HIP, CUDA)
Software Development Best Practices
Open-source Contributions

Education

Bachelor's degree in Computer Science
Advanced degrees (M.Sc., M.Eng., Ph.D.)

Job description



WHAT YOU DO AT AMD CHANGES EVERYTHING


We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences - the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world's most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.


AMD together we advance_




Senior Software Engineer - Data Center GPU Networking and Communication Middleware

THE TEAM:

AMD's Data Center GPU organization is transforming the industry with our AI-based graphics processors. Our primary objective is to design exceptional products that drive the evolution of the computing experience, serving as the cornerstone for enterprise data centers, artificial intelligence (AI), HPC, and embedded systems. If this resonates with you, come and join our Data Center GPU organization, where we are building amazing AI-powered products with amazing people.

THE ROLE:

As a GPU network software engineer, you will design, implement, and test networking features in communication libraries, middleware, and frameworks to provide best-in-class support for GPU applications running high-performance computing and machine learning workloads at scale. You will work with technical experts within AMD, our partners, and the open-source community to implement these features as part of AMD's Radeon Open Ecosystem (ROCm).

THE PERSON:

You are accustomed to working in a dynamic, geographically distributed agile team, where partnership and collaboration are paramount. You possess excellent written and verbal communication skills and strong attention to detail. You are results-oriented and accustomed to tight deadlines and changing priorities. Most importantly, you are constantly thinking of ways to improve the performance of multi-node GPU applications.

KEY RESPONSIBILITIES:

  • Design, implement, and test features to enhance GPU support in communication libraries, middleware and frameworks
  • Benchmark, profile and optimize code to maximize performance of multi-node GPU applications
  • Deliver high-quality code and documentation following best practices for open-source software development
  • Work with key technical experts across AMD and with our partners and customers to improve ROCm applications, libraries, and tools

PREFERRED EXPERIENCE:

  • Strong background in developing system software in C/C++
  • Experience with at least one of the following:
    • Implementing communication middleware like MPI/SHMEM
    • Implementing lower-level communication frameworks like UCX and libfabric, or development using RDMA APIs
    • Development and optimization of communication collective algorithms (e.g. AllReduce)
  • Familiarity with GPU programming in HIP or CUDA
  • In-depth knowledge of the best practices in software development, including testing, profiling, debugging, documentation, version control, issue tracking, and planning
  • Proven track record contributing to open-source projects

ACADEMIC CREDENTIALS:

  • Bachelor's degree in Computer Science, Electrical Engineering, or equivalent
  • Advanced degrees, such as M.Sc., M.Eng., or Ph.D., are preferred

LOCATION:

  • Santa Clara or San Jose, CA


AMD does not accept unsolicited resumes from headhunters, recruitment agencies or fee based recruitment services. AMD and its subsidiaries are equal opportunity employers and will consider all applicants without regard to race, marital status, sex, age, color, religion, national origin, veteran status, disability or any other characteristic protected by law. EOE/MFDV




Benefits offered are described in "AMD benefits at a glance."

AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.
