A leading technology company is seeking systems and compiler engineers to optimize machine learning workloads for AWS accelerators. The role involves in-depth performance analysis, collaboration with customers, and implementation of optimizations. Candidates should have at least 3 years of software development experience, a Bachelor's degree in computer science, and familiarity with ML frameworks such as PyTorch and with compiler infrastructure such as LLVM. This position offers a supportive work culture focused on mentorship and career growth.
The Annapurna Labs team at AWS builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on AWS custom machine learning accelerators, Inferentia and Trainium. The AWS Neuron SDK comprises an ML compiler, runtime, and application framework, which integrate into popular ML frameworks such as PyTorch. AWS Neuron running on Inferentia and Trainium is trusted and used by customers including Snap, Autodesk, and Amazon Alexa.
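For a concrete sense of what this integration looks like in practice, here is a minimal, illustrative sketch of compiling a PyTorch model through the Neuron SDK's torch-neuronx package. The toy model and input shapes are assumptions chosen for illustration, and running it requires a Neuron-capable instance (such as Trn1 or Inf2) with torch-neuronx installed.

```python
import torch
import torch_neuronx  # AWS Neuron's PyTorch integration

# A hypothetical toy model; any traceable torch.nn.Module works.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
).eval()

example_input = torch.rand(1, 128)

# Compile the model for Inferentia/Trainium via the Neuron compiler.
neuron_model = torch_neuronx.trace(model, example_input)

# Inference then runs on the accelerator through the compiled artifact.
output = neuron_model(example_input)
```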
The Neuron Compiler team develops a deep learning compiler stack that targets state-of-the-art LLM, vision, and multi-modal models created in TensorFlow, PyTorch, and JAX, enabling them to run performantly on our accelerators. The team is composed of engineers from the compiler, research, and product communities, and aims to provide a toolchain that delivers a quantum leap in performance.
The Neuron team is hiring systems and compiler engineers to solve our customers’ toughest problems. Specifically, the performance team in Toronto focuses on analysis and optimization of system-level performance of machine learning models on AWS ML accelerators. The team conducts in-depth profiling across multiple layers of the technology stack, from frameworks and compilers to runtime and collectives, to meet and exceed customer requirements while maintaining a competitive edge. As part of the Neuron Compiler organization, the team identifies and implements performance optimizations and folds these improvements back into the compiler, so they are applied automatically for the benefit of all customers.
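As a simple illustration of the measurement side of this work, the sketch below times end-to-end inference latency for a compiled model. The helper function, warm-up policy, and iteration counts are illustrative assumptions rather than the team's actual tooling; the Neuron SDK also ships dedicated profiling tools such as neuron-profile for deeper, per-layer analysis.

```python
import time
import torch

def measure_latency(model, example_input, warmup=10, iters=100):
    """Return mean wall-clock latency per inference in milliseconds."""
    # Warm-up iterations amortize one-time costs (graph loading, caches).
    for _ in range(warmup):
        model(example_input)
    start = time.perf_counter()
    for _ in range(iters):
        model(example_input)
    elapsed = time.perf_counter() - start
    return elapsed / iters * 1e3

# Usage with the compiled model from the earlier sketch:
# print(f"{measure_latency(neuron_model, example_input):.3f} ms/inference")
```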
This is an opportunity to work on products at the intersection of machine learning, high-performance computing, and distributed architectures. You will architect and implement business-critical features, publish research, and mentor a team of experienced engineers. We operate in spaces that are very large, yet our teams remain small and agile. There is no blueprint; we are inventing and experimenting. The team works closely with customers on their model enablement, providing direct support and optimization expertise to ensure machine learning workloads achieve optimal performance on AWS ML accelerators.
Our performance engineers collaborate across compiler, runtime, and framework teams to optimize machine learning workloads for our global customer base. The work sits at the intersection of machine learning, high-performance computing, and distributed systems, and you’ll bring a passion for performance analysis across all three to this role.
We value diverse experiences and encourage you to apply even if you do not meet all of the qualifications. AWS is committed to an inclusive culture, comprehensive benefits, and its leadership principles, and we support work-life balance, mentorship, and career growth.
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status. If you require a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit the accommodations page for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.