A leading technology company is seeking systems and compiler engineers to optimize machine learning workloads for AWS accelerators. The role involves in-depth performance analysis, collaboration with customers, and implementation of optimizations. Candidates should have at least 3 years of software development experience, a Bachelor's degree in computer science, and familiarity with ML frameworks such as PyTorch and with compiler infrastructure such as LLVM. This position offers a supportive work culture focused on mentorship and career growth.
The Annapurna Labs team at AWS builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on AWS custom machine learning accelerators, Inferentia and Trainium. The AWS Neuron SDK comprises an ML compiler, runtime, and application framework, which integrate into popular ML frameworks such as PyTorch. AWS Neuron running on Inferentia and Trainium is trusted and used by customers including Snap, Autodesk, and Amazon Alexa.
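For a concrete sense of what this integration looks like in practice, here is a minimal, illustrative sketch of compiling a PyTorch model through the Neuron SDK's torch-neuronx package. The toy model and input shapes are assumptions chosen for illustration, and running it requires a Neuron-capable instance (such as Trn1 or Inf2) with torch-neuronx installed.

```python
import torch
import torch_neuronx  # AWS Neuron's PyTorch integration

# A hypothetical toy model; any traceable torch.nn.Module works.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
).eval()

example_input = torch.rand(1, 128)

# Compile the model for Inferentia/Trainium via the Neuron compiler.
neuron_model = torch_neuronx.trace(model, example_input)

# Inference then runs on the accelerator through the compiled artifact.
output = neuron_model(example_input)
```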
The Neuron Compiler team develops a deep learning compiler stack that targets state-of-the-art LLM, vision, and multi-modal models created in TensorFlow, PyTorch, and JAX, enabling them to run performantly on our accelerators. The team is composed of engineers from the compiler, research, and product communities, and aims to provide a toolchain that delivers a quantum leap in performance.
The Neuron team is hiring systems and compiler engineers to solve our customers’ toughest problems. Specifically, the performance team in Toronto focuses on analysis and optimization of system-level performance of machine learning models on AWS ML accelerators. The team conducts in-depth profiling across multiple layers of the technology stack, from frameworks and compilers to runtime and collectives, to meet and exceed customer requirements while maintaining a competitive edge. As part of the Neuron Compiler organization, the team identifies and implements performance optimizations and folds these improvements back into the compiler, so they are applied automatically for the benefit of all customers.
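As a simple illustration of the measurement side of this work, the sketch below times end-to-end inference latency for a compiled model. The helper function, warm-up policy, and iteration counts are illustrative assumptions rather than the team's actual tooling; the Neuron SDK also ships dedicated profiling tools such as neuron-profile for deeper, per-layer analysis.

```python
import time
import torch

def measure_latency(model, example_input, warmup=10, iters=100):
    """Return mean wall-clock latency per inference in milliseconds."""
    # Warm-up iterations amortize one-time costs (graph loading, caches).
    for _ in range(warmup):
        model(example_input)
    start = time.perf_counter()
    for _ in range(iters):
        model(example_input)
    elapsed = time.perf_counter() - start
    return elapsed / iters * 1e3

# Usage with the compiled model from the earlier sketch:
# print(f"{measure_latency(neuron_model, example_input):.3f} ms/inference")
```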
This is an opportunity to work on products at the intersection of machine learning, high-performance computing, and distributed architectures. You will architect and implement business-critical features, publish research, and mentor a team of experienced engineers. We operate in spaces that are very large, yet our teams remain small and agile. There is no blueprint; we are inventing and experimenting. The team works closely with customers on their model enablement, providing direct support and optimization expertise to ensure machine learning workloads achieve optimal performance on AWS ML accelerators.
Our performance engineers collaborate across compiler, runtime, and framework teams to optimize machine learning workloads for our global customer base. The work sits at the intersection of machine learning, high-performance computing, and distributed systems, and you’ll bring a passion for performance analysis across all three to this role.
We value diverse experiences and encourage you to apply even if you do not meet all of the qualifications. AWS is committed to an inclusive culture, comprehensive benefits, and its leadership principles, and we support work-life balance, mentorship, and career growth.
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status. If you require a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit the accommodations page for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.