HPC System Administrator

Lenovo

Montreal

On-site

CAD 90,000 - 100,000

Full time

Yesterday

Job summary

Lenovo is seeking an HPC System Administrator to join their Data Center team in Montreal. The role involves monitoring and managing data center infrastructure, troubleshooting hardware and software issues, and providing high-quality customer support. Candidates should have experience in HPC systems, strong problem-solving skills, and the ability to work collaboratively in a dynamic environment.

Benefits

Flexibility and the possibility of working from home
Stimulating office environment
Opportunities for professional growth

Qualifications

  • Experience in HPC system troubleshooting, monitoring, and support.
  • Fluency in English.
  • Able to perform OS installation and upgrades with no supervision.

Responsibilities

  • Monitoring, maintaining, and managing the physical infrastructure of a data center.
  • Responding to alerts, performing preventative maintenance, and managing issues.
  • Installation, configuration, and support of services within the customer platform.

Skills

Storage Management
High-Performance Computing (HPC) Systems
Hardware Installations
Hardware Maintenance
Security Management
Customer Service
System Administration
Network administration

Tools

Red Hat
Confluent Platform

Job description

Lenovo is the Number 1 supercomputing provider in the world as measured by TOP500 entries, and we keep going. Our Data Center team is dedicated to fostering an environment that encourages entrepreneurism and ownership - a workplace where your talents can be challenged, and your efforts recognized and rewarded.

As part of the Lenovo Professional Services team your responsibilities will include:

-Monitoring, maintaining, and managing the physical infrastructure of a data center, ensuring its smooth operation, reliability, and security.

-Monitoring power and cooling systems and network connectivity

-Hardware and system software debugging and troubleshooting.

-Addressing hardware and software issues

-Responding to alerts, performing preventative maintenance, rolling out and upgrading firmware versions, and managing any issues that may arise to minimize downtime and optimize data availability

-Acting as the customers’ Single Point of Contact (SPOC).

-Opening hardware trouble tickets against different vendors.

-Following up and reporting progress on all issues; responding to users and supporting them in the daily operation of the cluster.

-Performing daily system administration tasks, including granting and removing access, and investigating and correcting reported defects in the cluster while adhering to the service levels.

-Resolving errors by developing, testing, and implementing changes to the system.

-Providing corrective and preventive maintenance, and troubleshooting and isolating defects.

-Performing software and firmware testing for any fixes, upgrades, and security patches.

-Updating the customers’ documentation as and when necessary to reflect the changes made to the system.

-Compiling monthly reports and taking part in monthly customer service reviews where required.

Working directly with the customer you will be responsible for:

-The installation, configuration, and the support of services as required within the central customer Research Computing Services platform team.

-Work with vendors and the customer Technology Office to design, implement, and upgrade services using change management and revision control processes to ensure that changes are properly tracked and available for audit when required.

-Analyze and troubleshoot system issues, defining and resolving complex problems.

-Develop innovative solutions to continuously improve HPC and address any shortfalls in provision.

-Work closely with other customer staff, including Infrastructure Technology, Security and Governance teams.

-Understand the importance of security and seek specialist security advice to secure systems.

-Maintain a knowledge of technical developments, tools, and ideas in HPC, attending seminars, conferences, technical briefings, and other community events.

-Work flexibly as a part of the customer Platform Team, supporting the group’s activities and undertaking individual projects.

-Write and maintain documentation on system design and management processes to ensure knowledge is accessible and disseminated appropriately within the customer team.

-Deliver a high-quality service through a collaborative approach and outstanding analytical skills.

-Take an active part in meetings, representing the customer, and facilitating collaboration between partners.

-Assist customer researchers to utilize the HPC resource, providing subject matter expertise support to the Customer Research Computing Analysts.

The role gives you a great deal of independence and opportunity to take the lead and advise. You will be expected to work effectively in providing remote technical services in the areas of HPC & AI platforms and solutions. Also, you will be responsible for implementing and supporting HPC solutions at customer sites, involving Server, Storage, Network, Power and Cooling, OS, and cluster management software.

The job responsibilities involve providing knowledge transfer; troubleshooting, resolving, and advising on the infrastructure and technologies mentioned above; and continuously monitoring critical parameters such as power usage, temperature, humidity, flows, pressure, network performance, and server health through dedicated monitoring systems that generate alerts for potential issues.

The role includes incident management by quickly identifying and resolving issues raised by alerts, including hardware failures, network disruptions, and power fluctuations.

The position will require you to build a solid customer relationship, so it is important that you have strong customer interaction skills and the ability to make technical decisions, to collaborate on projects with several organization verticals, partners, and customers, and to develop training and knowledge base documentation.

Position Requirements:

-Experience in HPC system troubleshooting, monitoring, and support

-Experience in System administration

-Fluency in English

-Able to perform OS installation and upgrades with no supervision

-Able to perform high level problem determination

-Customer service skills, including written and oral communication with clients

What Lenovo can offer you:

-An exciting job with great opportunities for success

-A good potential to grow both professionally and personally

-You will be a part of the Number 1 PC vendor in the world

-Stimulating office environment

-Flexibility and the possibility of working from home

At Lenovo we are proud to be an equal opportunity company. This vacancy is open to people with disabilities, too.

Skills:

Storage Management

High-Performance Computing (HPC) Systems

Hardware Installations

Hardware Maintenance

Security Management

Security Standards

Customer Service

System Administration

Server Admin

Red Hat

Network administration

Confluent Platform

Seniority level
  • Mid-Senior level
Employment type
  • Full-time
Job function
  • Information Technology and Engineering
Industries
  • Information Services
