A leading university in Singapore seeks an HPC Systems Administrator to design and maintain high-performance computing clusters. This role includes managing resource scheduling and supporting scientific computing workflows, requiring strong Linux skills and a degree in a related field.
Responsibilities:
Design, deploy, and maintain high-performance computing clusters.
Manage resource scheduling systems such as SLURM (Simple Linux Utility for Resource Management) and PBS (Portable Batch System).
Support scientific computing workflows and assist users in optimizing applications.
Monitor system performance and troubleshoot hardware/software issues.
Provide systems administration and management services, including system availability statistics, IT support, systems hardening, patching, onboarding and decommissioning, and other systems-related support.
Collaborate with researchers and technical teams on computing needs and solutions.
Ensure security, updates, and compliance across GPU (Graphics Processing Unit) cluster infrastructure.
Handle emergency situations and proactively resolve issues.
Support any other AI Mega Center tasks as instructed by the supervisor.
Requirements:
At least a Bachelor’s degree in Computer Science, Engineering, or a related field.
At least 1 year of experience in HPC systems administration or engineering.
Proficiency with Linux system administration and shell scripting.
Experience with parallel computing frameworks and HPC workload managers is advantageous.
Familiarity with networking, storage systems, and performance tuning.
Experience in managing parallel file systems such as Lustre, GPFS (General Parallel File System), and BeeGFS (Fraunhofer Parallel Filesystem).
Good knowledge of RDMA (Remote Direct Memory Access)-based interconnects such as InfiniBand and RoCE (RDMA over Converged Ethernet).
Experience with containerization technologies (e.g., Docker, Singularity) and virtual machines.
Knowledge of cloud-based HPC is a plus.
Strong communication and collaboration skills.