Mako, a partner company hiring through the Scout AI platform, is seeking an expert engineer to join its R&D team working on AI model deployment. The role involves optimizing GPU performance and leading advancements in machine learning initiatives. You will work with the Mako compiler to improve AI inference and training across a range of hardware. If you have a strong background in CUDA, ROCm, or Triton, along with proficiency in C/C++ and Python, this is an opportunity to make a significant impact on a team pushing the boundaries of AI technology, with a range of benefits designed to support your professional growth.
Scout AI is a new hiring platform that connects software engineers to opportunities with world-class companies. On Scout, you get a more relevant, growth-oriented interviewing experience, you receive feedback on your performance, and you get end-to-end support to improve your chances of getting hired.
If you perform well on the Scout interview, you become eligible for opportunities with all companies in the Scout network, so you only need to complete the interview once.
This role is with our partner company that is actively hiring:
Mako
Mako's AI platform reduces AI compute costs by up to 70%
Our breakthrough technology eliminates the need for expensive and manual GPU optimization, automatically generating high-performance code that runs efficiently on any hardware. Two core capabilities drive immediate business value:
Cost Optimization: Deploy AI models with up to 70% lower computing costs, directly improving your bottom line.
Universal Deployment: Run your existing AI models at peak performance across any GPU infrastructure, eliminating vendor lock-in and scaling constraints.
Mako delivers continuous, automated performance improvements without requiring changes to your existing code or hiring specialized engineers. Our intelligent compiler automatically optimizes your AI workloads 24/7, ensuring you maintain peak efficiency as your models and infrastructure evolve.
At the core of our platform is an innovative compiler that leverages hardware-aware deep learning-based search to automatically select from the growing ecosystem of vendor-provided and open-source GPU kernel libraries. Our compiler extends beyond library selection with optimization passes for both vertical and horizontal kernel fusion, enabling the generation of novel kernels outside the original search space.
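For readers less familiar with the terms, the difference between vertical and horizontal fusion can be sketched in a few lines of plain Python. This is only an illustration of the general idea, not Mako's compiler or its output: the function names (`scale`, `relu`, `unfused`, `vertically_fused`, `horizontally_fused`) are hypothetical, and each Python function stands in for a GPU kernel launch.

```python
from typing import List, Tuple


def scale(xs: List[float], a: float) -> List[float]:
    """Stand-in for one elementwise 'kernel': multiply every element by a."""
    return [a * x for x in xs]


def relu(xs: List[float]) -> List[float]:
    """Stand-in for a second elementwise 'kernel': clamp negatives to zero."""
    return [max(0.0, x) for x in xs]


def unfused(xs: List[float], a: float) -> List[float]:
    """Two separate passes with an intermediate buffer between them."""
    tmp = scale(xs, a)  # first pass writes the intermediate result
    return relu(tmp)    # second pass reads it back


def vertically_fused(xs: List[float], a: float) -> List[float]:
    """Vertical fusion: producer and consumer merged into a single pass,
    so the intermediate buffer is never materialized."""
    return [max(0.0, a * x) for x in xs]


def horizontally_fused(
    xs: List[float], ys: List[float], a: float, b: float
) -> Tuple[List[float], List[float]]:
    """Horizontal fusion: two independent operations of the same shape
    handled in one loop (one 'launch') instead of two."""
    out_x: List[float] = []
    out_y: List[float] = []
    for x, y in zip(xs, ys):
        out_x.append(a * x)
        out_y.append(b * y)
    return out_x, out_y


if __name__ == "__main__":
    data = [-1.0, 2.0]
    # Both variants compute the same result; only the number of passes differs.
    print(unfused(data, 3.0) == vertically_fused(data, 3.0))
    print(horizontally_fused([1.0, 2.0], [3.0, 4.0], 2.0, 10.0))
```

On a real GPU the benefit of vertical fusion comes from avoiding a round trip through memory for the intermediate result, while horizontal fusion mainly reduces launch overhead and improves occupancy for small, independent operations.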
Our roadmap includes extending the compiler to generate entirely new kernels from scratch. By integrating cutting-edge AI technologies into the compilation pipeline from day one, Mako is pioneering the next generation of modern compilation.
Our R&D team is focused on creating the most efficient engine for deploying generative AI models, with efforts ranging from precise GPU kernel tuning to comprehensive system optimizations.
We're looking for an expert-level engineer with a strong background in CUDA, ROCm, or Triton kernel optimization. You will lead substantial improvements in GPU performance and play a key role in pioneering AI and machine learning initiatives.
Our team builds software infrastructure for high-performance AI inference and training on any hardware. There are three core components: