Senior Applied Scientist – ML Systems, Training & Inference Optimization
We are seeking an exceptional Senior Applied Scientist specializing in ML systems, training, and inference optimization to join the Deep Science for Systems and Services (DS3) team. This role requires deep expertise in performance engineering, kernel development, distributed systems, and AI workload optimization across heterogeneous compute platforms. You will invent and implement novel optimization techniques that directly impact the performance and cost‑efficiency of ML training and inference for AWS customers worldwide.
As a Senior Applied Scientist in DS3, you will work at the lowest levels of the software stack—writing custom CUDA kernels, optimizing PTX assembly, developing high‑performance operators for GPUs and AWS Neuron, designing efficient communication patterns for multi‑GPU and multi‑node training, and inventing new algorithmic approaches to accelerate transformer models and emerging architectures. Your work will span from single‑node inference optimization to large‑scale distributed training systems, influencing the design of AWS training and inference services and setting new standards for ML systems performance across the industry.
Deep Science for Systems and Services (DS3) is part of AWS Utility Computing (UC), the organization behind foundational services such as Amazon Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), as well as a steady stream of new products and features that set AWS's services apart in the industry.
Key Job Responsibilities
- Systems‑Level Scientific Innovation: Design and implement novel kernel‑level optimizations for ML inference and training workloads, including custom CUDA kernels, PTX‑level optimizations, and cross‑platform acceleration for CUDA and the AWS Neuron SDK.
- Performance Engineering Leadership: Drive 2–10× performance improvements in latency, throughput, and memory efficiency for production ML inference and training systems through systematic profiling, analysis, and optimization.
- Cross‑Platform Optimization: Develop and port high‑performance ML operators across GPUs, AWS Inferentia/Trainium, and emerging AI accelerators, ensuring optimal performance on each platform.
- Product‑Level Impact: Lead the design, implementation, and delivery of scientifically complex optimization solutions that directly improve customer experience and reduce AWS operational costs at scale.
- Scientific Rigor: Produce technical documentation and internal research reports demonstrating the correctness, efficiency, and scalability of your optimizations. Contribute to external publications when aligned with business needs.
- Technical Leadership: Influence your team's technical direction and scientific roadmap. Build consensus across engineering and science teams on optimization strategies and architectural decisions.
- Mentorship & Knowledge Sharing: Actively mentor junior scientists and engineers on performance engineering best practices, kernel development, and systems‑level optimization techniques.
Qualifications
- PhD in Computer Science, Computer Engineering, or a related technical field, OR a Master’s degree with 8+ years of relevant research or industry experience.
- 5+ years of hands‑on experience in performance optimization and systems programming for AI/ML workloads.
- Expert‑level proficiency in CUDA programming and GPU architecture, with demonstrated ability to write high‑performance custom kernels.
- Proven track record of delivering measurable performance improvements (2× or greater) in production systems.
- Strong C/C++ programming skills with experience in performance profiling tools such as NVIDIA Nsight, Linux Perf, or similar diagnostic frameworks.
- Experience optimizing inference and/or training for large language models (LLMs) and transformer‑based architectures, including MoE models, at scale.
- Hands‑on experience with the AWS Neuron SDK or other non‑NVIDIA AI acceleration platforms.
- Track record of optimizing ML workloads across diverse hardware: embedded devices (ARM Cortex, DSPs, NPUs) and data center GPUs (NVIDIA Ampere/Hopper).
- Experience with low‑level optimization techniques including assembly‑level tuning (NVIDIA PTX, x86/ARM assembly) and cross‑platform kernel porting.
- Experience leading performance optimization initiatives that resulted in significant cost savings or multi‑million dollar business impact.
- Proven ability to mentor and train engineers in performance engineering and low‑level optimization (e.g., mentoring 5+ team members or delivering workshop instruction).
- Entrepreneurial experience or track record of driving technical vision in startup, co‑founder, or product development environments.
Amazon is an equal opportunities employer. We believe passionately that employing a diverse workforce is central to our success. We make recruiting decisions based on your experience and skills. We value your passion to discover, invent, simplify and build. Protecting your privacy and the security of your data is a longstanding top priority for Amazon. Please consult our Privacy Notice to learn more about how we collect, use and transfer the personal data of our candidates.