An established industry player is seeking a Senior HPC Systems Engineer to join a dynamic team supporting NASA's High Performance Computing initiatives. This role involves providing Supercomputing Systems Administration, enhancing batch scheduling systems, and ensuring optimal performance of HPC resources. The ideal candidate will possess extensive experience in HPC environments, strong scripting abilities, and excellent communication skills. Join a forward-thinking company that values innovation and excellence, and play a key role in advancing high-performance computing solutions that make a significant impact in the field.
RedLine Performance Solutions (RedLine) has provided HPC solutions engineering services for 25 years and consistently keeps the "bar of excellence" high for new hires. This enables RedLine to accomplish what other firms cannot and promotes a high level of staff retention. We offer services ranging from full life cycle HPC systems engineering to remote managed services to HPC program analysis.
We are seeking a Senior HPC Systems Engineer to join our NASA NACS High Performance Computing team at NASA's Ames Research Center in Mountain View, CA. This role primarily provides Supercomputing Systems Administration support for our NASA NACS High Performance Computing (HPC) contract.
U.S. citizenship and the ability to obtain a Public Trust security clearance are mandatory requirements for this position. This position can be remote, but the engineer will work Pacific time zone business hours. Travel to the customer site will be required 2-3 times a year.
An individual at this skill level should have demonstrated extensive experience working with common HPC batch schedulers (e.g., PBS, Slurm, or Moab/Torque) while helping users of HPC resources resolve the issues they encounter in getting applications to execute efficiently. This individual should demonstrate experience installing, maintaining, and upgrading HPC systems. The individual, along with the entire HPC team, will be engaged in the day-to-day operations and support of the HPC resources. Activities may include system patching, operating system upgrades, deploying new systems, writing scripts, and troubleshooting system issues on the HPC system. The ability to interact with users to determine symptoms, and then reproduce their issues to isolate the root cause of failure, is a critical skill for this position. There will also be activities in testing, benchmarking, user tool scripting, and analyzing trouble tickets to find patterns indicating system or user education issues.
To learn more about RedLine, please visit our website at www.RedLinePerf.com