Sr. Software Development Engineer, FAR (Frontier AI & Robotics)

Amazon

San Francisco (CA)

On-site

USD 120,000 - 180,000

Full time

30+ days ago

Job summary

Join a forward-thinking company at the forefront of robotics and AI. As a Senior Software Development Engineer, you'll work with leading experts to transform innovative research into high-performance production systems. Your role will involve optimizing large-scale transformer architectures, leveraging advanced tools like CUDA and TensorRT, and collaborating with scientists to ensure efficient model performance. This exciting opportunity offers a chance to make a significant impact in the field of robotics, where your contributions will drive the next generation of AI solutions. If you're passionate about technology and eager to tackle ambitious challenges, this role is perfect for you.

Qualifications

  • 5+ years of software development experience with strong programming skills.
  • Experience in optimizing ML models and leading engineering teams.

Responsibilities

  • Optimize large-scale foundation models using TensorRT and CUDA.
  • Collaborate with scientists to enhance model architectures and performance.

Skills

Python
C++
CUDA
Machine Learning Optimization
Software Development
Design Patterns
Performance Profiling

Education

Bachelor's degree in computer science or equivalent

Tools

TensorRT
NVIDIA Profiling Tools
ONNX Runtime
CUDA Graph

Job description

Sr. Software Development Engineer, FAR (Frontier AI & Robotics)

Job ID: 2914307 | Amazon.com Services LLC

Join the next revolution in robotics at Amazon's Frontier AI & Robotics team, where you'll work alongside world-renowned AI pioneers like Pieter Abbeel, Rocky Duan, and Peter Chen to make breakthrough foundation models run at production scale. As a Senior Machine Learning Engineer embedded in our science team, you'll be instrumental in transforming cutting-edge research into high-performance production systems. You'll collaborate directly with scientists to optimize large-scale transformer architectures for robotics applications, leveraging your expertise in CUDA and TensorRT to achieve unprecedented inference efficiency at Amazon scale.
In this role, you'll balance deep technical optimization work with strategic input on model architecture decisions, ensuring our innovative robotics models are designed with performance in mind from the ground up. You'll leverage NVIDIA's acceleration stack and other compilation techniques to tackle ambitious performance targets, working at the intersection of large language models and real-world robotics applications.

Key job responsibilities

  1. Drive inference optimization strategies for large-scale foundation models using TensorRT, CUDA, and other NVIDIA tools
  2. Collaborate closely with scientists to influence model architectures for optimal hardware utilization
  3. Design and implement efficient compilation pipelines for complex transformer architectures
  4. Develop comprehensive benchmarking frameworks to measure and optimize model performance
  5. Build robust monitoring solutions to ensure reliable model serving at scale
  6. Explore and evaluate emerging optimization techniques including ONNX Runtime and other ML compilers
  7. Maintain high engineering standards through proper testing, documentation, and code review practices

A day in the life

  1. Optimize transformer blocks using custom CUDA kernels and TensorRT optimization techniques
  2. Partner with scientists to analyze model architectures and propose efficiency improvements
  3. Implement and benchmark various optimization strategies for large-scale models
  4. Debug performance bottlenecks using NVIDIA profiling tools
  5. Participate in technical discussions about new model architectures with the science team
  6. Design and maintain performance monitoring systems for production deployment
  7. Prototype new acceleration approaches using emerging compilation frameworks

BASIC QUALIFICATIONS

  1. Bachelor's degree in computer science or equivalent
  2. 5+ years of non-internship professional software development experience
  3. 5+ years of programming experience in at least one software programming language
  4. 5+ years of experience leading the design or architecture (design patterns, reliability, and scaling) of new and existing systems
  5. Experience as a mentor or tech lead, or leading an engineering team
  6. Strong expertise in Python, C++ and CUDA programming
  7. Experience with TensorRT or similar ML optimization frameworks
  8. Track record of optimizing ML models for production

PREFERRED QUALIFICATIONS

  1. Expertise in NVIDIA's ML stack (cuDNN, CUDA Graph, etc.)
  2. Experience with ML compilers (ONNX Runtime, TVM, etc.)
  3. Experience with transformer model optimization
  4. Background in performance profiling and optimization
  5. Experience working directly with research teams
  6. Track record of building robust monitoring systems
  7. Experience with large-scale ML serving systems

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.

Similar jobs

Sr. Software Development Engineer, FAR (Frontier AI & Robotics)

Amazon

San Francisco

On-site

USD 120,000 - 180,000

30+ days ago

Machine Learning Engineer III, FAR (Frontier AI & Robotics)

Amazon

San Francisco

On-site

USD 151,000 - 262,000

30+ days ago

Principal Applied Scientist, FAR (Frontier AI & Robotics)

Amazon

San Francisco

On-site

USD 120,000 - 180,000

29 days ago

Applied Scientist, FAR (Frontier AI & Robotics)

Amazon

San Francisco

On-site

USD 120,000 - 180,000

30+ days ago

Sr. Applied Scientist, FAR (Frontier AI & Robotics)

Amazon

Sunnyvale

On-site

USD 120,000 - 180,000

30+ days ago