System Administrator - High Performance Computing (HPC)

J&M Group

Ottawa

On-site

CAD 70,000 - 100,000

Full time

5 days ago

Job summary

J&M Group is offering an exciting entry-level opportunity for a System Administrator specializing in High Performance Computing, located in Ottawa. The role involves managing and supporting the HPC environment while utilizing Linux platforms and HPC tools such as Slurm. If you are a problem-solver with a keen interest in technology, come and contribute to cutting-edge HPC initiatives with us.

Qualifications

  • Experience with Linux, Slurm, and OpenHPC is critical.
  • Ability to manage and troubleshoot HPC systems is required.
  • Experience in scripting and automation is beneficial.

Responsibilities

  • Manage day-to-day operations of the HPC environment.
  • Implement and manage system patches and upgrades.
  • Respond to and resolve complex technical issues.

Skills

Problem-solving
Analytical skills
Communication
Troubleshooting
Scripting (Bash)
Technical documentation
Time management

Experience

In-depth experience in Linux platforms
Experience in HPC tools (Slurm, LSF, GridEngine)

Tools

KVM
Active Directory
Foreman

Job description

Main Responsibilities

  • Identify, diagnose, and resolve level two problems for users of the software and hardware, LAN and WAN, VPN, the Internet, mobile devices, and new computer technology; communicate solutions to end-users.
  • Respond to more complex issues (second line support) escalated by the first line support using problem-solving skills and analysis to identify root causes of issues, determine course of action and propose creative solutions.
  • Manage day-to-day operations and support of the HPC environment (Linux).
  • Take ownership of capacity, availability and performance of the HPC cluster(s).
  • Support end users in the submission and management of jobs based on Slurm and OpenHPC.
  • Migrate existing nodes as required to Linux.
  • Implement and manage a system based on Foreman or similar to manage patching and oversee cluster management.
  • Implement patches and upgrades to Linux, Slurm and OpenHPC as required.
  • Install new servers and storage, build new clusters, configure and manage Linux distributions, hypervisors (KVM) and tooling.
  • Automate where possible to increase efficiency of operations.
  • Fulfill firewall access requests for the environment.
  • Escalate priority support issues to senior staff and/or other corporate technology groups.
  • Collect and document all relevant information prior to escalation to allow senior staff to operate efficiently.
  • Document, track and monitor problems to ensure timely resolution.
  • Assist in tracking helpdesk calls pertaining to application, networking, and systems problems and issues.
  • Assign username, password and access right permissions for multiple proprietary applications, as well as client software.
  • Support identity management and multifactor authentication, including integration between Active Directory and Linux platforms.
  • Perform hardware & software audits.
  • Conduct product research and evaluation.
  • Provide emergency support on incidents as required.
  • Perform occasional after-hours maintenance.
  • Participate in the incident on-call rotation as required.
  • Provide day-to-day operational support.

Specialized Skills, Knowledge & Abilities

  • In-depth and demonstrated experience in the installation and operation of Linux platforms in an Enterprise environment (Ubuntu / RedHat).
  • Experience in the use of KVM or other hypervisors.
  • Experience in HPC tools such as Slurm, LSF or GridEngine.
  • Demonstrated knowledge of HPC clusters and use cases.
  • Working technical knowledge of network systems.
  • Working technical knowledge of current systems software, protocols and standards including Active Directory.
  • Identity management using Microsoft Identity Manager and Azure AD Connect.
  • Solid understanding of Windows-based endpoints.
  • Solid scripting experience (e.g. Bash).
  • Excellent written and oral communication skills.
  • Excellent problem-solving skills.
  • Strong analytical and troubleshooting skills.
  • Strong interpersonal and organizational skills.
  • Must be well organized and able to grasp system concepts and communicate their applications.
  • Must be capable of quickly learning new systems and associated software applications for proficient execution of tasks.
  • Ability to manage multiple demands with time-related constraints in a fast-paced environment.
  • Ability to prioritize and schedule work as necessary to maintain department standards and service level agreements.
  • Ability to speak effectively before groups of internal employees, communicate technical information, create and deliver presentations and information sessions to both technical and nontechnical personnel.
  • Demonstrated experience in applying technical expertise and in-depth evaluation to solve complex problems in own area of expertise.
  • Ability to create and maintain documentation and training materials, including KB articles, for technical staff and end-user audiences.
  • Microsoft Windows experience is an asset.
  • Bilingualism (English / French) is an asset.

Seniority level

Entry level

Employment type

Contract

Job function

Information Technology

IT Services and IT Consulting
