An established industry player is seeking a senior Technical Expert to join their innovative team focused on CPU/NPU development. This role involves engaging in advanced workload modelling, driving software/hardware integration, and planning co-optimization features for next-generation processors. The ideal candidate will possess extensive experience in CPU/NPU architecture, performance projection, and the use of cutting-edge simulators. Join a dynamic environment where your contributions will significantly impact the future of high-performance computing. If you are passionate about technology and eager to work on groundbreaking projects, this opportunity is perfect for you.
Job Description
We are seeking a senior Technical Expert with a deep background in workload modelling and CPU/NPU architecture.
This expert will be a key member of a team at the forefront of CPU/NPU development. Responsibilities will include planning and implementing tool systems for architectural exploration and performance analysis. Additionally, the role involves driving software/hardware vertical integration and planning software/hardware co-optimization features for next-generation processors.
The ideal candidate should possess a strong understanding of CPU/NPU architecture and workload extraction, as well as a good grasp of compiler, binary analysis, and software/hardware co-optimization.
Key Responsibilities:
Investigate cutting-edge, high-performance server CPU/NPU core and SoC architecture designs, providing the data needed to support key design decisions.
Design and implement tool systems for architectural exploration and performance analysis.
Develop strategies for software/hardware co-optimization features and lead the integration of software and hardware components for next-generation processors.
Construct a non-intrusive, highly accurate system for characterizing and modelling complex workloads, ensuring precise workload representation.
Analyse real-world workloads and extract their distinctive features, delivering essential insights to our in-house chip development department.
Required:
Extensive industry experience in workload modelling and CPU/NPU architecture development.
Skilled in performance projection and architectural exploration using SoC simulators.
Proficient in the development of slicing tools.
Skilled in developing and using performance simulators, including gem5 (O3 model), Sniper, and others.
Proficient in benchmark analysis and characterization.
Experience in GPGPU performance analysis.
Strong knowledge of the theory and practice of deep learning, computer vision, natural language processing, or computer graphics.
Strong programming skills in languages such as C++ and Python, and experience with frameworks such as TensorFlow and PyTorch.
Strong grasp of binary analysis and software/hardware co-optimization techniques.
Excellent collaboration and interpersonal skills.
Considered a plus:
Experience in developing for QEMU and DynamoRIO (or x86 PIN).
Experience in developing and using performance simulators such as gem5 (O3 model), Sniper, or others.
Experience with CUDA or OpenCL programming.