A leading company in cloud solutions is seeking a PhD ML Infrastructure Engineer for its Distributed Training team. This role involves developing and optimizing machine learning models on AWS's specialized AI hardware. The ideal candidate will have a PhD in a relevant field, proficiency in C++ and Python, and experience with ML frameworks like PyTorch and JAX. Join us to work on cutting-edge technologies that help shape the future of machine learning.
Role: [PhD] ML Infrastructure Engineer - Distributed Training, AWS Neuron, Annapurna Labs at Amazon Web Services (AWS).
If you apply to this position, your application will be considered for all locations we hire for in the United States.
About Annapurna Labs
Annapurna Labs designs silicon and software that accelerate innovation. Customers choose us to create cloud solutions that solve challenges that were unimaginable a short time ago. Our custom chips, accelerators, and software stacks enable us to take on technical challenges that have never been seen before, delivering results that help our customers change the world.
About AWS Neuron
AWS Neuron is the complete software stack for the AWS Trainium and Inferentia cloud-scale machine learning accelerators. This role is for a Senior Machine Learning Engineer on the Distributed Training team for AWS Neuron, responsible for developing, enabling, and performance-tuning a range of ML model families, including large-scale LLMs such as GPT and Llama, as well as Stable Diffusion, Vision Transformers, and more.
Responsibilities
Qualifications
Basic:
Preferred:
Amazon is an equal opportunity employer and values diversity. We welcome applicants from all backgrounds and experiences.