THE ROLE:
You will work on AMD’s Reconfigurable Acceleration Platform boards. These are Peripheral Component Interconnect Express (PCIe) compliant, FPGA-based accelerator boards featuring the newest AMD products, designed to accelerate compute-intensive applications such as machine learning, data analytics, and video processing.
THE PERSON:
You will have strong written and verbal communication skills and a systematic approach to your job. A team player, you are used to delivering results in a fast-paced environment.
KEY RESPONSIBILITIES:
- Perform platform-level diagnostics and failure analysis across complex compute hardware, including accelerator cards, FPGA-based boards, DPUs, SmartNICs, and server-level platforms.
- Lead board-level fault isolation down to component level, covering electrical, functional, and high-speed interface failures.
- Support RMA investigations end-to-end, from initial screening to deep-dive fault isolation and cross-functional root cause alignment with firmware, driver, and silicon teams.
- Replicate customer system and application environments to reproduce failures and validate fixes in a controlled lab setup.
- Debug system-level interactions across multi-tray or multi-rack configurations, including power sequencing and communication links.
PREFERRED EXPERIENCE:
Technical Skill Requirements:
- Strong capability in component-level fault isolation, including use of schematics, board layouts, and BOMs to drive accurate diagnosis and correlate failures with potential root causes.
- Debug issues across power delivery, digital interfaces (I²C, SPI, PCIe, USB), high-speed SerDes links, and analog circuits; identify opens/shorts, marginal solder joints, and passive-component failures.
- XJTAG diagnostic development: integrate and develop XJTAG boundary-scan test suites for interconnect tests, device verification, and improved diagnostic coverage.
- Deep knowledge of PCIe and Ethernet high-speed interfaces, signal integrity concepts, link training, and debug methodology.
- Hands-on experience with IBERT or equivalent SerDes test platforms for high-speed channel characterization and margin analysis.
- Accelerator / FPGA Platform Diagnostics (Generalized Alveo-Class Boards): diagnose issues related to FPGA configuration, on-board power, and DDR/HBM interfaces; understand how boards interact in a system and how that interaction affects functional performance and failure modes.
- Linux & System Debug: proficient with Ubuntu and CentOS and their command-line tools; network bring-up; basic scripting (Bash/Python) for automated diagnostics.
Soft Skills Requirements:
- Strong analytical and structured problem-solving ability.
- Clear communicator with the ability to summarize technical findings for management and cross-functional teams.
- Collaborative and comfortable working with hardware, firmware, and system-level teams.
- Willingness to document diagnostic methodologies and share knowledge to build team capability.
ACADEMIC CREDENTIALS:
- Bachelor’s or Master’s degree in a related discipline preferred
LOCATION:
Singapore