Overview
Project description
This is a position within the AI GPU Software Group (AGS), which is responsible for AMD's ML SDK initiatives, with a focus on development of the ROCm Profiling Tools for the AMD ROCm Stack. As a contributor to the success of AMD's products, you will be part of a leading team that drives and improves AMD's ability to deliver the highest-quality, industry-leading technologies to market.
Responsibilities
- Build AI-guided GPU performance analysis and productize cutting-edge academic profiling approaches.
- Work closely with the open-source code base, SW/HW teams, and product management to define the profiling tools requirements.
- Assist the architect in building the profiling vision, strategy, and analysis methodology.
- Design, code, test, and integrate features, enhancements, and bug fixes into the profiling tools stack.
- Communicate and collaborate across many teams to coordinate features across the profiling tools stack.
Must have
- Strong C/C++ and Python development background
- Knowledge of GPU architecture
- Experience in performance analysis on large ML/HPC applications
- Experience in custom tools development on open-source platforms
- Experience with Linux, Docker, GitHub, and development environments
- Research experience in applying ML algorithms to performance analysis
- Experience with production software quality assurance practices, methodologies, and procedures
- Doctoral degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent
Nice to have
- Familiarity with performance analysis tools and methods
- Familiarity with SQL databases and building efficient queries
- Excellent problem-solving skills and willingness to think outside the box
- Excellent communication skills and experience working with global teams
- Able to adapt quickly to new code bases and contribute production-level software to the profiling tools