Senior HPC Engineer

ZipRecruiter

Cambridge

Hybrid

USD 60,000 - 100,000

Full time

30+ days ago

Job summary

An established industry player is seeking a Scientific Computing Specialist to join their innovative team. This role focuses on delivering high-performance computing solutions that support scientific initiatives. You will work closely with scientists and IT professionals, driving architecture and execution of projects that establish best practices in IT infrastructure. The ideal candidate will possess strong Linux system administration skills and experience in solution architecture, with a passion for transforming IT services to enhance client growth. Join a dynamic team that values excellence, accountability, and continuous improvement in a hybrid work environment.

Benefits

Competitive salary and bonus package
Comprehensive health and wellness benefits
Company-provided Life and Long-Term Insurance
Company-sponsored 401(k) Plan
Continuing education benefit
Team-focused culture
Unlimited opportunity for advancement

Qualifications

  • 5+ years of HPC cluster administration experience required.
  • Strong command-line system administration skills are essential.
  • Experience with CloudOps and Infrastructure as Code (IaC) is preferred.

Responsibilities

  • Design and evolve HPC platforms and support customer workflows.
  • Collaborate with teams to deliver Compute at Scale services.
  • Document new and existing computational assets comprehensively.

Skills

HPC cluster administration
Solution Architecture
Cloud Infrastructure Deployment
Linux system administration
Scripting tools
Communication skills
Time-management skills
Attention to detail

Education

Bachelor's degree in Computer Science
Master's degree in Computer Science

Tools

SLURM
Grid Engine
Ansible
Terraform
Docker
Singularity

Job description

About Us
RCH Solutions is an established and rapidly growing global provider of computational, research, and data science expertise within Life Sciences and Healthcare. At RCH Solutions, our team rallies around a culture crafted for learning and achieving. We’re relentless in our pursuit of innovation and demand of ourselves that we deliver a ground-breaking computing experience for our clients, so that they can deliver life-saving science to humanity.


Core Values
At RCH, our Core Values are more than just words: they represent the threads that weave together the fabric of our culture. Used as a guide when interviewing new team members, as a barometer when evaluating our performance as individuals and teams, and even when deciding which customers to work with, RCH’s Values embody the behaviors by which we measure our success and create a framework for our growth as people and professionals.


Our Core Values:

  1. Embrace Excellence: We strive for best-in-class delivery of innovation and service.
  2. Be Accountable: Integrity, ownership, and accountability are non-negotiable.
  3. Adventure Together: We are committed to fostering a culture that embraces continuous improvement.
  4. Succeed as a Team: We believe harnessing the power of a team drives outcomes not achievable by individuals.
  5. Boundaries and Balance: Work-life balance is a core facet of our culture.

If you share our core values, we encourage you to continue reading this posting, as you may have found a great home for your career.


Job Description
RCH Solutions is seeking a Scientific Computing Specialist to work closely with customer stakeholders, scientists, and IT professionals to deliver Compute at Scale and support our customers' scientific initiatives. The objectives for this role center on developing, evolving, and administering HPC platforms, along with supporting scientific applications, workflows, and other related infrastructure, both on-prem and cloud-hosted. Our ideal candidate also has hands-on experience with Linux system administration as well as solution architecting and engineering (on-prem and cloud-based) and will be instrumental in transforming how IT computing services are leveraged to support our clients' growth. This role will involve driving architecture, roadmaps, and execution of projects to establish and operate IT infrastructure best practices for customers.


Responsibilities include:

  • Full stack support - design and evolution of platforms, application administration, supporting customer workflows, profiling and performance tuning, monitoring and maintenance of scoped systems, platform and systems administration, troubleshooting hardware, software, and networking related issues, solution architecting and hands-on engineering (on-prem + Cloud), as well as documentation.
  • Collaborating with cross-discipline team members and customers to deliver HPC and peripheral Compute at Scale services.
  • Maintaining a thorough understanding of related industry best practices.
  • Supporting internal and customer Architecture and Design efforts.
  • Supporting customers with their workflow pipelines (advisory and hands-on).
  • Comprehensively documenting new and existing computational assets.
  • Maintaining the flexibility to pivot as engagement scopes may evolve.
  • Support for AWS Cloud applications, migrations, and modernization.
  • CloudOps / IaC for ongoing platform management.
  • Setup and configuration of AWS Cloud infrastructure for new platform builds.
  • Ensuring system compliance with company security standards and applicable regulatory requirements.
  • Transition support for modernized services to operational teams.
  • Providing engineering-level troubleshooting and service restoration for operational issues as they arise on supported platforms.
  • Providing training and mentorship for junior-level team members.
  • Serving as an escalation point on multiple engagements to ensure resolution.

Essential Qualifications

  • A bachelor’s degree or master’s degree in Computer Science or related field.
  • 5+ years of experience administering HPC clusters and systems.
  • Experience with SLURM and Grid Engine scheduling software.
  • 5+ years of professional experience in Solution Architecture or Cloud Infrastructure Deployment and support.
  • 7+ years of professional experience developing or administering compute solutions for Scientific / Research IT domains, Life Sciences being a plus.
  • Extensive command-line system administration experience.
  • User and group management.
  • Advanced knowledge of Active Directory, DNS, DHCP, LDAP, NFS, SMB.
  • Building applications from source code, installing, maintaining, and troubleshooting application-level Linux and scientific software in line with industry best practices.
  • Installation of Linux operating system and fine-tuning.
  • Familiarity with leveraging and maintaining Linux package management systems.
  • Intermediate OS level networking knowledge.
  • Experience using scripting tools, automation tools, and configuration management tools.
  • Ansible, Terraform, and CloudFormation experience.
  • Experience administering and integrating Scientific / Research applications.
  • Strong time-management skills; able to complete projects in a timely manner, plan and prioritize tasks while keeping leadership and stakeholders updated regularly on status.
  • Excellent communication skills, including preparation of written documentation for IT colleagues and end users.
  • Proactive thinking skills to identify potential issues and solution options prior to incidents occurring.
  • Extreme attention to detail is needed to interface with multiple clients simultaneously.
  • Ability to understand and analyze complex technical problems and situations.
  • Candidates must be passionate engineers with a strong vision and a desire to stay on top of trends in the Scientific Computing sector.
  • Ability to work independently or with a team.
  • Ability to take a project from start to finish with minimal supervision.
  • Candidates must not require sponsorship now or in the future.

Qualifications
RCH provides services and solutions for the unique challenges of Life Sciences advanced computing, and leverages teams with cross-functional IT skills to meet these challenges. The ideal candidates for this role will have experience working with both cross-functional IT (Public Cloud skills being a plus) and scientific skill sets.

  • Experience with Python, R, or other related data science programming.
  • Experience with Posit products (Package Manager, Connect, Workbench) either in an end-user or administrator capacity.
  • Experience working with and/or supporting databases.
  • Experience managing large amounts of data effectively.
  • Experience working with AI/ML technologies.
  • Experience with containerizing compute workloads via Docker or Singularity.
  • Experience with NVIDIA DGX systems.

Additional information
Great talent should benefit from a great work environment. If you join our team, you’ll have access to:

  • A competitive salary and bonus package based on experience.
  • Comprehensive health and wellness benefits, including Medical, Dental, and Vision Insurance.
  • Company-provided Life and Long-Term Insurance.
  • Company-sponsored 401(k) Plan.
  • Company-provided continuing education benefit.
  • Team-focused culture and unlimited opportunity for advancement.

**This is a hybrid role, and the candidate will be required to be onsite at our facility in Boston, MA, several days per week.
**This role is only open to applicants who do not need sponsorship now or in the future. No third parties, please.
