Software Engineer, Systems ML - Frameworks / Compilers / Kernels | Ingénieur logiciel, Systèmes[...]

Meta

Toronto

On-site

CAD 80,000 - 140,000

Full time

3 days ago

Job summary

Join a forward-thinking company as a key player in developing cutting-edge AI compiler strategies. This role offers the chance to work on the industry-leading PyTorch AI framework, optimizing performance and deployment for advanced deep learning models. Collaborate closely with AI researchers and hardware design teams to create innovative solutions that push the boundaries of machine learning on next-generation hardware. If you are passionate about AI and eager to contribute to groundbreaking projects, this opportunity is perfect for you.

Qualifications

  • 7+ years of experience in AI framework development or deep learning model acceleration.
  • Strong knowledge of GPU, CPU, or AI hardware architectures.

Responsibilities

  • Develop the software stack, focusing on AI frameworks and compiler optimizations.
  • Collaborate with AI researchers to enhance deep learning models.

Skills

AI framework development
CUDA programming
Performance tuning
Machine learning optimization
Deep learning model analysis

Education

Bachelor's degree in Computer Science
Master's degree in Computer Science
PhD in Computer Science

Tools

PyTorch
TensorFlow
Caffe2
ONNX
TensorRT
MLIR
LLVM

Job description

In this role, you will be a member of the MTIA (Meta Training & Inference Accelerator) Software team, part of the larger, industry-leading PyTorch AI framework organization. The MTIA Software team has been developing a comprehensive AI compiler strategy that delivers a highly flexible platform to train and serve new DL/ML model architectures, combined with auto-tuned high performance for production environments across specialized hardware architectures. The compiler stack, DL graph optimizations, and kernel authoring for specific hardware directly impact the performance and deployment velocity of both AI training and inference platforms at Meta.

You will work on one of the core areas, such as PyTorch framework components, the AI compiler and runtime, high-performance kernels, and tooling, to accelerate machine learning workloads on the current and next generation of MTIA AI hardware platforms. You will work closely with AI researchers to analyze deep learning models and lower them efficiently onto MTIA hardware. You will also partner with hardware design teams to develop compiler optimizations for high performance.

You will apply software development best practices to design features, optimizations, and performance tuning techniques. You will gain valuable experience in developing machine learning compiler frameworks and will help drive next-generation hardware-software co-design for AI domain-specific problems.

Responsibilities
  • Develop the software stack with a focus on one of the following core areas: AI frameworks, the compiler stack, or high-performance kernel development and acceleration on next-generation hardware architectures.
  • Contribute to the development of the industry-leading PyTorch AI framework core compilers to support new state-of-the-art inference and training AI hardware accelerators and optimize their performance.
  • Analyze deep learning networks, and develop and implement compiler optimization algorithms.
  • Collaborate with AI research scientists to accelerate the next generation of deep learning models, such as recommendation systems, generative AI, computer vision, and NLP.
  • Tune and optimize the performance of deep learning framework and software components.

Qualifications
  • Experience in AI framework development or accelerating deep learning models on hardware architectures.
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.
  • A Bachelor's degree in Computer Science, Computer Engineering, or relevant technical field and 7+ years of experience in AI framework development or accelerating deep learning models on hardware architectures, OR a Master's degree in Computer Science, Computer Engineering, or relevant technical field and 4+ years of experience in AI framework development or accelerating deep learning models on hardware architectures, OR a PhD in Computer Science, Computer Engineering, or relevant technical field and 3+ years of experience in AI framework development or accelerating deep learning models on hardware architectures.
  • Knowledge of GPU, CPU, or AI hardware accelerator architectures.
  • Experience working with frameworks like PyTorch, Caffe2, TensorFlow, ONNX, TensorRT.
  • OR AI high-performance kernels: Experience with CUDA programming, OpenMP / OpenCL programming, or AI hardware accelerator kernel programming. Experience in accelerating libraries on AI hardware, similar to cuBLAS, cuDNN, CUTLASS, HIP, ROCm, etc.
  • OR AI compiler: Experience with compiler optimizations such as loop optimizations, vectorization, parallelization, and hardware-specific optimizations such as SIMD. Experience with MLIR, LLVM, IREE, XLA, TVM, or Halide is a plus.
  • OR AI frameworks: Experience in developing training and inference framework components. Experience in system performance optimizations such as runtime analysis of latency, memory bandwidth, I/O access, compute utilization analysis and associated tooling development.