Infrastructure Engineer

Second Renaissance

California (MO)

On-site

USD 149,500 - 184,000

Full time

Yesterday

Job summary

A leading scientific institution is seeking an Infrastructure Engineer to enhance and optimize its Hybrid Cloud Infrastructure Platform. The ideal candidate will collaborate with researchers, manage cloud resources, and implement security protocols in support of the institute's work on AI biological foundation models. This role requires a strong background in distributed systems and excellent problem-solving abilities in a fast-paced environment.

Qualifications

  • Extensive experience with distributed systems including cloud platforms.
  • Proficient in bare-metal system provisioning and scripting languages.
  • Strong understanding of security best practices, along with strong collaboration skills.

Responsibilities

  • Oversee the operation and optimization of our private cloud GPU cluster.
  • Design and execute strategies for automating system configurations.
  • Develop security protocols and maintain comprehensive documentation.

Skills

Advanced Linux system administration
Scripting in Python
Cloud platforms (AWS, GCP, Azure)
Problem-solving
Collaboration

Education

Bachelor's degree in Computer Science

Tools

Ansible
Kubernetes
Nagios

Job description

About Arc Institute

The Arc Institute is a new scientific institution that conducts curiosity-driven basic science and technology development to understand and treat complex human diseases. Headquartered in Palo Alto, California, Arc is an independent research organization founded on the belief that many important research programs will be enabled by new institutional models. Arc operates in partnership with Stanford University, UCSF, and UC Berkeley.

While the prevailing university research model has yielded many tremendous successes, we believe in the importance of institutional experimentation as a way to make progress. These experiments include:

  • Funding: Arc will fully fund Core Investigators’ (PIs’) research groups, liberating scientists from the typical constraints of project-based external grants.
  • Technology: Biomedical research has become increasingly dependent on complex tooling. Arc Technology Centers develop, optimize and deploy rapidly advancing experimental and computational technologies in collaboration with Core Investigators.
  • Support: Arc aims to provide first-class operational, financial, and scientific support that will enable scientists to pursue long-term, high-risk, high-reward research that can meaningfully advance progress toward cures for diseases, including neurodegeneration, cancer, and immune dysfunction.
  • Culture: We believe that culture matters enormously in science and that excellence is difficult to sustain. We aim to create a culture that is focused on scientific curiosity, a deep commitment to truth, broad ambition, and selfless collaboration.

Arc has scaled to nearly 200 people to date. With $650M+ in committed funding and a state-of-the-art new lab facility in Palo Alto, Arc will continue to grow quickly in the coming years.

About the position

We are seeking an Infrastructure Engineer to join our team. In this role, you will be responsible for designing and optimizing our Hybrid Cloud Infrastructure Platform across public, private, and on-premises datacenters. You will work closely with researchers, developers, and IT professionals to ensure the availability, reliability, and performance of our compute, networking, and storage. Your work will fuel the development of AI biological foundation models (e.g., Evo, Arc’s recently expanded DNA foundation model), the Virtual Cell Initiative, and other cutting-edge bioinformatics projects as part of Institute-wide efforts.

About you

  • You lead with empathy. You know that successful systems are more about the user than the tool. You enjoy building relationships and credibility with your colleagues.
  • You enjoy solving problems. Any new project is an interesting puzzle. So is a tricky troubleshooting issue. You get satisfaction from helping someone get to resolution.
  • You’re curious. You like to keep track of the latest developments in your field, and to learn about the substance behind your employer’s mission.

In this position you will

  • Oversee the operation and optimization of our private cloud GPU cluster, focusing on enhancing availability, performance, and user experience.
  • Design and execute strategies for automating system configurations efficiently and safely, ensuring minimal disruption to production.
  • Develop a unified compute capacity platform with fixed and autoscaling resources across private and public cloud resources.
  • Facilitate efficient, high-throughput, and seamless data transfer between instruments and compute environments.
  • Enable the continuous integration and deployment of long-running services and databases across our hybrid platform.
  • Elevate system reliability by achieving additional “nines” of availability.
  • Develop and maintain comprehensive security protocols, including network security measures, access controls, vulnerability assessments, and continuous monitoring, to protect infrastructure and data from potential threats and breaches.
  • Collaborate with scientists to assess their computational requirements and deliver tailored resources and support.
  • Create and maintain comprehensive documentation for system configurations, operational procedures, security policies, and end-user guidance through a well-organized Wiki.

Requirements

  • Bachelor's degree in Computer Science, Information Technology, or a related field.
  • Extensive experience with distributed systems, including cloud platforms (AWS, GCP, or Azure) and/or HPC environments (Slurm, Kubernetes, Grid Engine, Torque, etc.).
  • Advanced Linux system administration skills, including performance tuning and troubleshooting.
  • Proficiency in bare-metal system provisioning (Ansible, Puppet, Chef, virtualization, and/or containerization).
  • Proven ability in scripting languages like Python, Bash, or Perl.
  • Familiarity with network protocols, storage systems, and high-speed interconnects (InfiniBand, RoCE).
  • Working knowledge of monitoring tools such as Nagios, Prometheus/Grafana, or New Relic.
  • Experience developing and maintaining software that interacts with NVIDIA GPUs, including drivers and diagnostic tools (CUDA, nvcc, NCCL, etc.).
  • Strong understanding of security best practices with hands-on experience in implementing and maintaining security measures.
  • Excellent problem-solving skills and the ability to work under pressure.
  • Strong communication and collaboration skills.

The base salary range for this position is $149,500-$184,000. These amounts reflect the range of base salary that the Institute reasonably would expect to pay a new hire or internal candidate for this position. The actual base compensation paid to any individual for this position may vary depending on factors such as experience, market conditions, education/training, skill level, and whether the compensation is internally equitable, and does not include bonuses, commissions, differential pay, other forms of compensation, or benefits. This position is also eligible to receive an annual discretionary bonus, with the amount dependent on individual and institute performance factors.