HPC Specialist - Lustre, Cray, Linux, Infiniband

ComTech Europe Limited

United Kingdom

Remote

GBP 60,000 - 80,000

Full time

Today

Job summary

A technology consulting firm is seeking an HPC specialist with extensive experience in Lustre file system deployment for a large-scale project. The role involves designing file architectures, automating processes with scripting, and managing high-speed interconnects. This position is fully remote and starts immediately, initially for three months with potential for extension.

Qualifications

  • In-depth understanding and deployment of Lustre file systems.
  • Proficient in scripting for automation.
  • Experience with managing HPC file infrastructures.

Responsibilities

  • Design and deploy Lustre architectures for file and directory management.
  • Automate mass updates and manage data movement.
  • Manage high-speed interconnects, including topology design and performance tuning.

Job description

My client is looking for an experienced HPC specialist for a large-scale HPC project. The consultant will need an in-depth understanding of Lustre file system architectures and experience deploying them.

The consultant will require experience in the following areas:

Lustre File System Design & Optimization:

  • In-depth understanding and deployment of Lustre architectures. Able to work with the end customer to design and deploy file and directory structures, including applying ACLs and nodemaps.

Automation & Scripting:

  • Proficient in scripting for automating mass updates, permission changes, and data movement across large-scale data lakes.
  • Advanced scripting and large-scale data management: Ansible, Puppet, Salt, Chef, with experience in the following languages: Python, Perl, Shell.
  • Databases: MySQL, PostgreSQL, MongoDB

Petabyte-Scale Storage Deployment:

  • Hands-on experience with managing HPC file infrastructures. Knowledge of Cray storage systems would be an advantage.
  • Familiarity with ZFS and parallel file system integration.

Linux Permissions & Nodemaps:

  • Strong skills in Linux ACLs, SELinux, and nodemap configuration for secure and efficient access control.
  • Experience in multi-user, multi-tenant environments with complex permission hierarchies.

Data Fabric & Interconnects:

  • Working knowledge of high-speed interconnects such as InfiniBand, including topology design and performance tuning.
  • Understanding of RDMA, fabric management tools, and integration with storage and compute nodes.

Skills required:

  • Red Hat OpenShift
  • Lustre, NFS
  • File systems
  • Cray Storage Systems
  • Automation & Scripting
  • Ansible, Python, Perl, Shell
  • Linux
  • ACLs, SELinux, Nodemap
  • Data Fabric
  • InfiniBand, RDMA

Location:

The role can be completed 100% remotely.

Start:

The role is to start immediately, and the contract will initially run for 3 months with potential to extend.