Lead Inference Performance Engineer

Cerebras

Toronto

On-site

CAD 90,000 - 120,000

Full time

Yesterday

Job summary

A leading AI technology firm in Toronto is seeking skilled engineers for its inference performance team. The role focuses on optimizing AI model inference speed and involves working closely with hardware and software interactions. Ideal candidates have a strong background in computer architecture, experience with performance profiling, and proficiency in C++ and Python. Join a groundbreaking team and contribute to innovative AI solutions within a dynamic and inclusive culture.

Benefits

Collaborative work culture
Job stability with startup vitality
Opportunities for continuous learning

Qualifications

  • Experience in Computer Architecture, CPU/GPU Performance, Kernel Optimization, or HPC.
  • Understanding of low-level deep learning / LLM math.
  • 3+ years of relevant experience.

Responsibilities

  • Build performance models for ML models.
  • Optimize and debug kernel microcode to improve inference speed.
  • Understand and debug runtime performance.

Skills

Computer architecture
C++
Python
Problem-solving
Performance profiling

Education

Bachelor's / Master's / PhD in Electrical Engineering or Computer Science

Tools

CPU/GPU simulators