HPC Systems Administrator - Performance, Security & Ops

A*STAR RESEARCH ENTITIES

Singapore

On-site

SGD 60,000 - 80,000

Full time

Today

Job summary

A leading research organization in Singapore is seeking an HPC Systems Administrator to manage its HPC systems and keep them stable. The role covers system operations, user and job management, incident response, and security compliance. Candidates should have a degree in Computer Science or a related field and at least 2 years of relevant Linux administration experience. Proficiency in scripting and cluster management tools is preferred. Strong troubleshooting and communication skills are essential.

Qualifications

  • Minimum 2 years of experience in Linux system administration, preferably in HPC environments.
  • Basic understanding of RDMA interconnects (InfiniBand, RoCE) and parallel file systems (Lustre, GPFS, BeeGFS).
  • Understanding of basic network protocols such as DHCP, DNS, TFTP, and SMTP.

Responsibilities

  • Administer HPC compute nodes, storage systems, and internal networks.
  • Monitor system health using tools like Grafana, Prometheus, and custom scripts.
  • Respond to system alerts and user-reported issues.

Skills

Linux system administration
Scripting (Python, Bash)
Cluster management tools (xCAT, BCM, HPCM)
Job schedulers (PBS Pro, Slurm)
Troubleshooting skills
Communication skills

Education

Degree in Computer Science, Engineering, IT, or a related field