Software Development Engineer, SGLang and Inference Stack

Advanced Micro Devices

Vancouver

On-site

CAD 90,000 - 120,000

Full time

Job summary

A leading semiconductor company based in Vancouver is seeking an experienced engineer to optimize deep learning frameworks for AMD GPUs. The role involves enhancing GPU kernel performance and collaborating with internal teams and open-source communities. Candidates should have strong skills in GPGPU C++, Triton, and software engineering best practices, along with a Bachelor's or Master's degree in a relevant field. This position offers the opportunity to significantly shape AMD's AI software ecosystem.

Benefits

Benefits package
Collaborative work environment

Qualifications

  • Proficient in programming with C++ and/or Python.
  • Hands-on experience with LLM optimization frameworks is highly preferred.
  • Familiarity with compiler design and GPU architectures is a plus.

Responsibilities

  • Optimize performance of deep learning frameworks on AMD GPUs.
  • Develop and tune large-scale training and inference models.
  • Design and implement high-performance GPU kernels.

Skills

GPGPU C++
Triton
TileLang
Software Engineering Best Practices

Education

Bachelor’s or Master’s Degree in Computer Science or related field

Tools

TensorFlow
PyTorch
HIP
CUDA

Job description

Overview

WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

The Role

As a core member of the team, you will play a pivotal role in optimizing and developing deep learning frameworks for AMD GPUs. Your work will be instrumental in enhancing GPU kernel performance, accelerating deep learning models, and enabling RL training and SOTA LLM and multimodal inference at scale across multi-GPU and multi-node systems. You will collaborate across internal GPU software teams and engage with open-source communities to integrate and optimize cutting-edge compiler technologies and drive upstream contributions that benefit AMD’s AI software ecosystem.

The Person

We are looking for a skilled engineer with strong technical and analytical expertise in GPGPU C++, Triton, TileLang, or DSL development within Linux environments. The ideal candidate thrives in both collaborative team settings and independent work, and can define goals, manage development efforts, and deliver high-quality solutions. Strong problem-solving skills, a proactive approach, and a keen understanding of software engineering best practices are essential.

Key Responsibilities
  • Optimize Deep Learning Frameworks: Enhance the performance of frameworks such as TensorFlow, PyTorch, and SGLang on AMD GPUs through upstream contributions to their open-source repositories.
  • Develop and Optimize Deep Learning Models: Profile, analyze, and tune large-scale training and inference models, including targeted code changes, for optimal performance on AMD hardware; provide day-0 support for state-of-the-art models such as DeepSeek 3.2 and Kimi K2.5.
  • GPU Kernel Development: Design, implement, and optimize high-performance GPU kernels using HIP, Triton, TileLang, or other DSLs for AI operator efficiency (a minimal illustrative kernel sketch follows this list).
  • Collaborate with GPU Library and Compiler Teams: Work closely with internal compiler and GPU math library teams to integrate, optimize, and align kernel-level optimizations with full-stack performance goals; initiate and support codegen optimizations at multiple levels of the stack.
  • Contribute to SGLang Development: Support optimization, feature development, and scaling of the SGLang framework across AMD GPU platforms for LLM and multimodal serving and RL training.
  • Distributed System Optimization: Tune and scale performance across both multi-GPU (scale-up) and multi-node (scale-out) environments, including inference parallelism, prefill-decode disaggregation, Wide-EP, and collective communication strategies.
  • Graph Compiler Integration: Integrate and optimize runtime execution through graph compilers such as XLA, TorchDynamo, or custom pipelines.
  • Open-Source Collaboration: Partner with external maintainers to understand framework needs, propose optimizations, and upstream contributions effectively.
  • Apply Engineering Best Practices: Leverage modern software engineering practices in debugging, profiling, test-driven development, and CI/CD integration.
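
As a rough illustration of the kernel-DSL work described under GPU Kernel Development above, the sketch below is a minimal Triton vector-add kernel in Python. It is not AMD production code: the function names, block size, and shapes are illustrative assumptions, and the same pattern runs on any GPU backend Triton supports.

    import torch
    import triton
    import triton.language as tl

    @triton.jit
    def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
        # Each program instance handles one contiguous block of elements.
        pid = tl.program_id(axis=0)
        offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
        mask = offsets < n_elements          # guard the tail of the tensor
        x = tl.load(x_ptr + offsets, mask=mask)
        y = tl.load(y_ptr + offsets, mask=mask)
        tl.store(out_ptr + offsets, x + y, mask=mask)

    def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        out = torch.empty_like(x)
        n = out.numel()
        grid = (triton.cdiv(n, 1024),)       # one program per 1024-element block
        add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
        return out

Real operator work in this role would target far heavier kernels (attention, GEMM, MoE), but the launch-grid, masking, and block-size tuning concerns shown here carry over.
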
Preferred Experience
  • Strong Programming Skills: Proficient in C++ and/or Python (PyTorch, Triton, TileLang), with demonstrated ability to code, debug, profile, and optimize performance-critical code.
  • SGLang and LLM Optimization: Hands-on experience with SGLang or similar LLM inference frameworks is highly preferred.
  • Compiler and GPU Architecture Knowledge: Background in compiler design or familiarity with technologies like LLVM, MLIR, or ROCm is a plus.
  • Heterogeneous System Workloads: Experience running and scaling workloads on large-scale, heterogeneous clusters (CPU + GPU) using distributed training or inference strategies.
  • AI Framework Integration: Experience contributing to or integrating optimizations into deep learning frameworks such as PyTorch, SGLang, vLLM, Slime, or VeRL (see the graph-capture sketch after this list).
  • GPGPU Computing: Working knowledge of HIP, CUDA, Triton, TileLang, or other GPU programming models; experience with GCN/CDNA architecture preferred.
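
Framework integration work of this kind often centers on graph capture and backend code generation (TorchDynamo is named under Graph Compiler Integration above). The snippet below is a minimal, hypothetical torch.compile example, not code from this role; the function and tensor shapes are made up, and it only shows where a backend compiler slots into a PyTorch workload.

    import torch

    def mlp_block(x, w1, w2):
        # A toy two-layer block standing in for a model component.
        return torch.nn.functional.gelu(x @ w1) @ w2

    # TorchDynamo captures the Python-level graph; the selected backend
    # (Inductor by default) generates fused device code for the target GPU.
    compiled_block = torch.compile(mlp_block)

    # "cuda" is also the device string used by ROCm/HIP builds of PyTorch.
    x = torch.randn(128, 512, device="cuda")
    w1 = torch.randn(512, 2048, device="cuda")
    w2 = torch.randn(2048, 512, device="cuda")
    out = compiled_block(x, w1, w2)
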
Academic Credentials
  • Bachelor’s and/or Master’s Degree in Computer Science, Computer Engineering, Electrical Engineering, Physics or a related field.

Benefits

Benefits offered are described in AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.

AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD’s “Responsible AI Policy” is available here.

This posting is for an existing vacancy.
