Join a forward-thinking company as a Senior Kernel Engineer, where you'll collaborate with top engineers to develop software that optimizes AI inference on cutting-edge hardware. Your role will involve implementing neural network compute kernels, improving kernel library abstractions, and working closely with compiler teams to enhance performance. If you're passionate about solving complex problems and have a strong background in Python and C/C++, this is the perfect opportunity to make a significant impact in the AI field.
We are looking for best-in-class engineers to join our existing top-notch team. When you join us, you will be part of a team that designs, develops, and verifies the software that interacts with our chip, collaborating with our hardware engineers and fellow software engineers in the process. By creating software that fully realizes the capabilities of the hardware, you will help bring AI inference to the general public.
As part of this exceptional team, you are able to identify functional and performance bottlenecks, and you get excited by working out how to alleviate them in order to achieve scalable and reliable software. You excel in an environment with complex software and hardware designs.
We are looking for an experienced Senior Kernel Engineer who can help build and optimize our SDK. Our tools and libraries unlock industry-leading performance and power efficiency on our unique at-memory AI inference chips. We enable customers to compile models directly to run on our architectures, and provide tools to analyze and optimize performance.
The kernel library is at the heart of our SDK. It leverages hardware features for fast computation, divides work flexibly among parallel computation elements, and provides highly configurable data-flow options for all of our kernels.
The successful candidate will build a deep understanding of the capabilities and limitations of the architecture, and of how features of the kernel library enable performant push-button compilations.