
Senior Linux Infrastructure Engineer (HPC)

The Voleon Group

United States

Remote

USD 170,000 - 205,000

Full time

3 days ago

Job summary

A leading technology company is seeking a Senior HPC Infrastructure Engineer to design, optimize, and maintain high-performance computing resources. The role involves managing complex Linux systems and collaborating with top-tier professionals in a dynamic environment. This position offers competitive compensation, an enriching workplace culture, and opportunities for innovation in the finance sector.

Benefits

Daily catered lunches
401(k) plan with a company match
20 days paid time off
Comprehensive benefits package

Qualifications

  • 5+ years of Linux systems expertise.
  • Hands-on HPC cluster administration experience.
  • Strong performance tuning and troubleshooting skills.

Responsibilities

  • Manage HPC clusters and resource optimization.
  • Diagnose and resolve complex Linux system issues.
  • Develop automation/scripts for HPC management.

Skills

Linux expertise
Scripting proficiency (Bash/Python)
Performance tuning
Networking fundamentals

Tools

SLURM
PBS
Ceph
Lustre
NFS

Job description

Voleon is a technology company that applies state-of-the-art AI and machine learning techniques to real-world problems in finance. For more than a decade, we have led our industry and worked at the frontier of applying AI/ML to investment management. We have become a multibillion-dollar asset manager, and we have ambitious goals for the future.

As a Senior HPC Infrastructure Engineer, you’ll design, optimize, and maintain HPC clusters, storage, and systems. You’ll ensure high-performance computing resources remain efficient, secure, and reliable. The ideal candidate will have deep expertise in Linux environments, scripting, automation, and troubleshooting complex system issues. You will play a key role in maintaining system uptime, implementing security best practices, optimizing performance, and supporting a growing, dynamic team of software, research, and systems engineers.

Your Team

We look for brilliant people with a passion for solving problems through innovation and engineering fundamentals. You’ll work in a collaborative environment that encourages creative thinking and efficient implementation. We embrace experimentation. You’ll work alongside experienced engineers recruited from leading technology companies and universities. You and your team will collaborate closely with top machine learning researchers.

We seek to hire someone in one of the following locations: Sacramento, Berkeley, or New York City.

Your colleagues will include internationally recognized experts in artificial intelligence and machine learning research as well as highly experienced finance and technology professionals. The people who shape our company also come from other backgrounds, including concert music performance, humanitarian aid, opera singing, sports writing, and BMX racing. You will be part of a team that loves to succeed together.

In addition to our enriching and collegial working environment, we offer highly competitive compensation and benefits packages, technology talks by our experts, a beautiful modern office, daily catered lunches, and more.


Responsibilities

  • Manage HPC clusters, scheduling, and resource optimization (SLURM, PBS).
  • Configure and optimize distributed filesystems (Ceph, Lustre, NFS).
  • Diagnose and resolve complex Linux system issues and bottlenecks.
  • Develop automation/scripts for HPC cluster management.
  • Perform performance tuning (CPU, memory, networking, storage).
  • Participate in an on-call rotation.

Requirements

  • 5+ years of Linux systems expertise (kernel tuning, troubleshooting).
  • Hands-on HPC cluster administration (SLURM, PBS).
  • Experience managing distributed filesystems (Ceph, Lustre, NFS).
  • Scripting proficiency (Bash/Python) for automation and management.
  • Strong performance tuning and troubleshooting skills.
  • Networking fundamentals (high-speed interconnects, TCP/IP tuning).

Preferred (Nice-to-Have)

  • Familiarity with containerization (Docker, Singularity).
  • Experience with hybrid/cloud HPC environments (OpenStack, AWS).
  • Exposure to DevOps automation tools (Terraform, CI/CD).
  • Background in scientific/research computing or engineering workloads.

Compensation

The base salary for this position is $170,000 to $205,000 in the location(s) of this posting. Individual salaries are determined through a variety of factors, including, but not limited to, education, experience, knowledge, skills, and geography. Base salary does not include other forms of total compensation, such as bonus compensation and other benefits. Our benefits package includes medical, dental, and vision coverage, life and AD&D insurance, 20 days of paid time off, 9 sick days, and a 401(k) plan with a company match.

“Friends of Voleon” Candidate Referral Program

If you have a great candidate in mind for this role and would like the chance to earn $15,000 if your referred candidate is successfully hired and employed by The Voleon Group, please use this form to submit your referral. For details on eligibility and terms and conditions, please review the Voleon Referral Bonus Program.

Equal Opportunity Employer

The Voleon Group is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.

Vaccination Requirement

The Voleon Group has implemented a policy requiring all employees who will be entering our worksite, including new hires, to be fully vaccinated with the COVID-19 vaccine. This policy also applies to remote employees, as such employees will be asked to visit our offices from time to time. To the extent permitted by applicable law, proof of vaccination will be required as a condition of employment. This policy is part of Voleon’s ongoing efforts to ensure the safety and well-being of our employees and community, and to support public health efforts.


