Snr/ Engineer - Product Development (Failure Analysis)

ADVANCED MICRO DEVICES (SINGAPORE) PTE LTD

Singapore

On-site

SGD 50,000 - 90,000

Full time

Yesterday

Job summary

A leading technology company in Singapore is seeking a qualified individual to perform GPU Failure Analysis. The position involves conducting system-level tests, collaborating with engineering teams, and documenting findings. Candidates should possess a degree in Electrical or Computer Engineering and demonstrate strong analytical and programming skills in languages like C++, Python, and C#.

Qualifications

  • Strong knowledge of silicon-level or board-level debug, or of Failure Analysis.
  • Self-starter, analytical, and detail-oriented.
  • Excellent programming skills and familiarity with Linux.

Responsibilities

  • Perform Failure Analysis on GPU products for customer returns.
  • Conduct System-Level Tests to replicate failures.
  • Collaborate with engineering teams for root cause analysis.
  • Document findings and write reports for customer communication.

Skills

Failure Analysis knowledge
Analytical skills
Code development
Debugging complex systems
Familiarity with Linux
Knowledge of PCIe, USB
Programming in C++, C#, Python
MS Excel proficiency

Education

Bachelor’s or Master’s degree in Electrical and Electronic Engineering or Computer Engineering

Tools

Automated Test Equipment (ATE)
Advantest 93k platform
Multimeters
Oscilloscopes

Job description

WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

THE ROLE:

This role covers returns debug and RMA execution in the Quality & Reliability organization. You will provide supporting functions that ensure customer quality issues are addressed, and drive the required actions through failure analysis to improve product quality.

THE PERSON:

You will need strong verbal and written communication skills, which are essential when working with a global team. You are a proactive, outstanding teammate who focuses on teamwork, team building, and growing team success. AMD's environment is fast-paced, results-oriented, and built upon a legion of forward-thinking people with a passion for winning technology!

KEY RESPONSIBILITIES:
  • Perform PCBA-level and System-level Failure Analysis (FA) on GPU products, covering customer returns and field failures.
  • Conduct System-Level Test (SLT) on internal test boards or customer platforms to reproduce customer-reported failures and isolate the cause of failure.
  • Perform electrical FA using Automated Test Equipment (ATE) and fault isolation work at silicon level.
  • Collaborate with Device Analysis and Product Engineering teams for in-depth package/die-level investigations, fault isolation, and root cause analysis.
  • Investigate Excursion and Critical Issues, supporting DPPM (VF/NFF) improvement initiatives.
  • Drive test coverage analysis and enhancements to improve failure detection and mitigation.
  • Partner with validation, firmware, and hardware teams to resolve hardware, software, and platform issues.
  • Innovate, prototype, and evaluate new FA tools to improve GPU failure analysis capabilities.
  • Develop functional automation environments for test and stress software to enhance debug efficiency.
  • Provide technical assistance, resources, and equipment to support engineering teams in testing and debugging activities.
  • Plan, set up, and install server racks with air and liquid cooling capabilities for advanced test infrastructure.
  • Work closely with program managers and product line quality (PLQ)/customer-interacting teams to align failure analysis reports with external customer communication.
  • Document debug findings, root cause analysis, and corrective actions in clear, concise technical reports.
  • Serve as the local product owner, responsible for tracking and releasing ATE, SLT, and OSV test programs related to RMA and OSV programs.
  • Act as the go-to technical expert for owned products, supporting test program content, FA methodologies, and customer queries.
  • Proactively identify opportunities for process improvement, code quality enhancements, and hardware coverage expansion.
  • Other duties as assigned by supervisor.
PREFERRED EXPERIENCE:
  • Strong knowledge of silicon-level or board-level debug, or of Failure Analysis.
  • Analytical and detail-oriented, with a strong interest in debugging complex systems; a self-starter and a fast learner.
  • Excellent code development skills, and familiarity with Linux and with modern software tools, benchmarks, and development techniques.
  • System-Level Test and/or x86 architecture knowledge is strongly preferred.
  • Knowledge of ATE test methodology; experience with the Advantest 93k platform is a plus.
  • Experience working with power management features such as AVFS, P-states, etc.
  • Knowledge of industry standards such as PCIe, USB, or high bandwidth memory is a strong plus.
  • JTAG knowledge is a plus.
  • Strong understanding of SCAN or BIST test is a plus.
  • Programming experience with C++, C#, Python, shell script, or similar.
  • Experience with PC HW debugging, including voltage, networking, storage, and thermal control.
  • Experience with building computer systems (desktops, laptops, servers, etc).
  • Experience in server installation, configuration, and maintenance.
  • Good analytical and problem-solving skills.
  • Strong understanding of hardware debugging, test equipment (multimeters, oscilloscopes, thermal cameras, etc.), and system-level troubleshooting.
  • Proficient in reading electronic schematics and electronic component datasheets.
  • Proficient in MS Excel (e.g. Pivot tables, Charts, Power Query).
ACADEMIC CREDENTIALS:
  • Bachelor’s or Master’s degree in Electrical and Electronic Engineering or Computer Engineering, with relevant experience.
LOCATION:

Singapore
