Principal/Senior Staff/Staff GPGPU Design Engineer

SQL Pager LLC

San Francisco (CA)

Hybrid

USD 120,000 - 200,000

Full time

30+ days ago

Job summary

An innovative firm is seeking a Principal/Senior Staff/Staff GPGPU Design Engineer to join its team. This role involves developing GPGPU architectures optimized for latency in artificial intelligence applications. You will define micro-architectures, collaborate with algorithm teams, and support integration within complex SoCs. The position offers an intellectually stimulating environment where high standards are the norm, and your contributions will directly affect mission-critical projects. If you're passionate about pushing the boundaries of technology and thrive in a fast-paced setting, this opportunity may be a good fit.

Benefits

Medical insurance

Vision insurance

Dental insurance

401(k)

Qualifications

10+ years (Principal), 7+ years (Senior Staff), or 5+ years (Staff) of experience as a GPGPU Digital Design Engineer required.
Expertise in RTL logic design and low-power design essential.

Responsibilities

Develop a GPGPU for the AI computing architecture and define its micro-architecture.
Collaborate with other teams on SoC-level integration and verification.

Skills

GPGPU Digital Design

RTL Logic Design

Low-Power Design

Scripting (Tcl, Python)

Micro-architecture Definition

Education

BSEE/BSCE or equivalent

Master’s degree in science (preferred, not required)

Tools

Verilog/SystemVerilog

UPF

CUDA/OpenCL

Client Overview
Client is building the first latency-optimized SoC for their industry. Using its proven AI accelerator designs, it is targeting best-in-class latency with order-of-magnitude improvements for years to come.

Low latency has become the key enabler for the niche and other real-time applications, and the current industry state of the art is just not up to the task. Client has been developing its Neural Net Engine accelerators, optimizing them for latency and achieving the best LPPA (Latency, Performance, Power, Area) in the field. We are now building the corresponding SoC to deliver unrivaled products to mission-critical and real-time applications.

This is a fast-paced, intellectually challenging position, and you will work with a talented team driven by innovation and excellence. You’ll have relentlessly high standards for yourself and everyone you work with, and you’ll be constantly looking for ways to improve our products' performance, quality, and cost.

We’re changing the meaning of low latency, and we want individuals ready to rise to the challenge and take the industry by storm.

Job Responsibilities

We are seeking a dedicated, hands-on GPGPU Design Engineer to help develop a GPGPU for our artificial intelligence computing architecture.
As a Design Engineer, you will participate in architecture definition and modeling, verification test planning, and testbench architecture.
You will be responsible for the micro-architecture specification, the RTL in Verilog/SystemVerilog, and the performance/speed/power goals.
Collaborate with the Algorithm and Verification teams to design various functional IPs in a complex RISC-V-based SoC.
Define a micro-architecture for the implementation and usage of each functional block IP, possibly with externally sourced IPs.
Participate in SoC-level integration and verification.
Work with the physical design team on timing closure.

Required Skills

10+ years (Principal) / 7+ years (Senior Staff) / 5+ years (Staff) of general experience as a GPGPU Digital Design Engineer.
Experience deriving a module-level micro-architecture definition from given marketing requirements.
Expert in RTL logic design, CDC, RDC, scan insertion, lint, LEC, and synthesis with timing constraints.
Experience in low-power design with UPF.
Scripting experience with Tcl, Python, or a similar language.

Has gone through the entire ASIC design cycle at least once, from micro-architecture to post-silicon bring-up and validation.

In-depth knowledge of at least one of the following parallel processing hardware architectures:

Neural Network Computation Flow on GPU/NPU
Array/Vector/Systolic Processors
SIMD/SIMT Processing Pipelines
BaseJump Manycore
GPU cache hierarchy and latency hiding with a NoC interconnect fabric

Nice to have

Prior experience with GPU functional blocks, including caches.
Knowledge of various DNNs and their core ASIC implementations.
Knowledge of graphics/compute APIs, such as CUDA/OpenCL.
Knowledge of operating systems and the software/hardware interface.
Finite-precision analysis of multi-layer compute processing pipelines.

Education
BSEE/BSCE or equivalent. Master’s degree in science is preferred, but not required.

