Machine Learning Engineer (Large Systems)

Graphcore

Camden Town

On-site

GBP 60,000 - 80,000

Full time

Yesterday

Job summary

A leading AI technology company in Camden Town is looking for a Machine Learning Engineer to develop and optimize AI models. The successful candidate will work on large-scale systems, implement machine learning models, and collaborate with cross-functional teams. A background in Machine Learning or a related field is essential. The role offers a competitive salary and a range of benefits, including flexible working and private medical insurance.

Benefits

Flexible working

Generous annual leave policy

Private medical insurance

Health cash plan

Dental plan

Pension (matched up to 5%)

Life assurance

Employee assistance programme

Snacks and healthy food options

Qualifications

Proficiency in deep learning frameworks such as PyTorch or JAX.
Strong Python or C++ software development skills.
Expertise in deep learning from model training to optimisation.

Responsibilities

Implement the latest machine learning models and optimize them for performance.
Test and evaluate new internal software releases and provide feedback.
Collaborate with Research, Software, and Product teams.

Skills

Proficiency in deep learning frameworks such as PyTorch

Strong Python or C++ software development skills

Expertise in deep learning from model training to optimisation

Experience in distributed training or inference of ML models

Ability to explain complex technical concepts

Education

Bachelor’s, Master’s, or PhD in Machine Learning or a related field

Tools

Kubernetes

C++

As a Machine Learning Engineer in the Applied AI team at Graphcore, you will contribute to advancing AI technology by developing and optimizing AI models tailored to our specialized hardware. You will work on large-scale systems where performance is critical to the success of our projects. Working closely with the Software Development and Research teams, you will play a critical role in defining Graphcore's technology.

We seek engineers with strong technical skills and an understanding of AI model implementation at scale, eager to make a tangible impact in this rapidly evolving field.

The Applied AI team is a proxy for our customers, ensuring that Graphcore's technology works seamlessly with the AI ecosystem and at scale. We build reference applications, contribute to key software libraries – e.g. optimizing kernels for efficiency on our hardware – and collaborate with the Research team to develop and publish novel ideas in domains such as efficient compute, model scaling, and distributed training and inference of AI models for multiple modalities and applications.

Responsibilities

Implement the latest machine learning models and optimize them for performance and accuracy, scaling to thousands of accelerators.
Test and evaluate new internal software releases, provide feedback to software engineering teams, make necessary code fixes, and conduct code reviews.
Benchmark models and key ML techniques to identify performance bottlenecks and improve model efficiency.
Design and conduct experiments on novel AI methods, implement them, and evaluate results.
Collaborate with Research, Software, and Product teams to define, build, and test Graphcore's next‑generation AI hardware.
Engage with the AI community and stay current with the latest developments in AI.

Qualifications

Bachelor’s, Master’s, PhD, or equivalent experience in Machine Learning, Computer Science, Mathematics, Data Science, or a related field.
Proficiency in deep learning frameworks such as PyTorch or JAX.
Strong Python or C++ software development skills.
Expertise in deep learning from model training to optimisation and evaluation.
Experience in distributed training or inference of ML models across 64+ accelerators.
Capable of designing, executing, and reporting on ML experiments.
A deep understanding of performance bottlenecks and how to overcome them.
Ability to move quickly in a dynamic environment.
Enjoyment of cross-functional work and collaboration with other teams.
Strong communicator – able to explain complex technical concepts to different audiences.

Desirable

Experience in one or more of the following areas:
MLOps for Kubernetes-based clusters.
Building production systems with large language models.
Efficient computing based on low‑precision arithmetic.
Writing C++/Triton/CUDA kernels for performance optimisation of ML models.
Familiarity with HPC systems and networking, including InfiniBand, NVLink, and RoCE technologies.
Contribution to open-source projects or publication of research papers in relevant fields.
Knowledge of cloud computing platforms.
Keen to present, publish, and deliver talks in the AI community.

Benefits

In addition to a competitive salary, Graphcore offers flexible working, a generous annual leave policy, private medical insurance and health cash plan, a dental plan, pension (matched up to 5%), life assurance and income protection. We have a generous parental leave policy and an employee assistance programme (which includes health, mental wellbeing, and bereavement support). We offer a range of healthy food and snacks at our central Bristol office and have our own barista bar!

We welcome people of different backgrounds and experiences; we’re committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments.
