Staff Scientific HPC Engineer

Altos Labs

Cambridge

Hybrid

GBP 60,000 - 80,000

Full time

2 days ago

Job summary

A leading company in cell health and rejuvenation seeks a Staff Scientific HPC Engineer to manage and enhance their high-performance computing systems. The ideal candidate will have substantial HPC experience, focusing on GPU-accelerated computing and supporting large-scale AI/ML projects. Join a culture that values diversity and innovation to make impactful contributions to scientific excellence.

Qualifications

  • Relevant experience in HPC.
  • Extensive experience building and optimizing HPC systems.
  • Experience with distributed systems networking.

Responsibilities

  • Manage high-performance computing systems infrastructure.
  • Collaborate with research teams to meet computation needs.
  • Monitor system performance and troubleshoot issues.

Skills

Problem-solving
Communication
Collaboration

Education

Bachelor’s degree in computer science, information technology, or a related field

Tools

CUDA
OpenCL
TensorFlow
PyTorch

Job description

Location: Cambridge, United Kingdom
Job Category: Other
EU work permit required: Yes
Job Reference: 3e770cd415e3
Posted: 17.06.2025
Expiry Date: 01.08.2025

Our Mission

Our mission is to restore cell health and resilience through cell rejuvenation to reverse disease, injury, and the disabilities that can occur throughout life.

Our Value

Diversity at Altos

We believe that diverse perspectives are foundational to scientific innovation and inquiry.

We are building a company where exceptional scientists and industry leaders from around the world work side by side to advance a shared mission.

Our intentional focus is on Belonging, so that all employees know that they are valued for their unique perspectives.

At Altos, we are all accountable for sustaining a diverse and inclusive environment.

What You Will Contribute To Altos

The Altos Labs team is seeking a Systems Engineer and Administrator specializing in high-performance computing (HPC) and GPU-accelerated computing to work closely with the Scalable Modeling team and help manage our organization's scientific compute infrastructure. This role will lead the design, implementation, and support of our high-performance computing systems, with a focus on GPU-accelerated computation and high-performance storage. The successful candidate will work closely with our research and engineering teams to operate infrastructure and platforms for large-scale AI/ML model training and inference, ensuring efficient and reliable operation of our HPC infrastructure.

Responsibilities

  • Design, implement, and manage high-performance computing systems infrastructure, focusing on GPU compute capabilities.
  • Configure and maintain high-performance networking and storage infrastructure to support low-latency, high-throughput distributed computation.
  • Collaborate with ML research and engineering teams to understand and meet their accelerated computation needs, ensuring next-generation infrastructure support for emerging trends in AI/ML, including foundation models and LLMs.
  • Monitor system performance, and troubleshoot and address issues to ensure high availability and optimal performance.
  • Develop and maintain system documentation, including hardware/software configurations, troubleshooting guides, and operational procedures.
  • Conduct training sessions or workshops to educate users on the proper use of scientific computation infrastructure.
  • Stay up to date with the latest trends and advancements in HPC and GPU technologies, and advise senior leadership on procurement strategies for next-generation hardware and solutions.

Who You Are

Required Qualifications

  • Bachelor’s degree in computer science, information technology, or a related quantitative field.
  • Relevant experience in HPC.
  • Extensive experience building and optimizing high-performance computing systems.
  • Experience with networking and interconnect architectures commonly found in distributed systems (e.g. InfiniBand, Mellanox, 100GbE).
  • Experience managing high-performance storage systems (e.g. Ceph, Gluster, Lustre).
  • Knowledge of HPC system tools and software stacks, such as job schedulers (e.g. Slurm), performance monitoring, and system management.
  • Excellent problem-solving skills and the ability to troubleshoot complex system issues.
  • Strong communication skills and the ability to work collaboratively with both technical and non-technical team members.

Preferred Skills

  • Strong understanding of modern GPU architectures and programming frameworks like CUDA or OpenCL.
  • Experience working with NVIDIA Data Center-class GPUs (A100, H100, etc.).
  • Experience with deep learning frameworks like TensorFlow or PyTorch is a plus.
  • Familiarity with foundation models and their computational requirements.
  • Familiarity with the NVIDIA Enterprise AI software platform.

The salary range for Cambridge, UK:

Exact compensation may vary based on skills, experience, and location.

- Please read the Altos Labs EU and UK Applicant Privacy Notice.
- This Privacy Notice is not a contract, express or implied, and it does not set terms or conditions of employment.

What We Want You To Know

We are a culture of collaboration and scientific excellence, and we believe in the values of diversity, inclusion and belonging to inspire innovation.

Altos Labs provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
