An established industry player is seeking a skilled Infrastructure Engineer to manage NVIDIA GPU servers and cloud environments for innovative AI and HPC projects. This hybrid role requires expertise in Linux systems, server optimization, and collaboration with researchers and engineers. You will be responsible for ensuring high availability, implementing security protocols, and developing tools for HPC environments. This is a fantastic opportunity to work in a collaborative team, where your contributions will directly impact cutting-edge technology and machine learning initiatives. If you thrive in a dynamic environment and have a passion for innovation, this role is perfect for you.
What will I be doing:
Darktrace is seeking an experienced Infrastructure Engineer to manage, maintain, and optimize a dedicated NVIDIA GPU server and cloud environments for innovation projects. Responsibilities include setting up, configuring, and maintaining the servers and software stack. A successful candidate will work directly with Darktrace researchers and software engineers, ensuring optimal performance and availability for ongoing AI and HPC (high-performance computing) projects.
This is a hybrid role, with compulsory attendance two days a week in the Cambridge office.
This role focuses on maintaining and optimising the Linux operating system, file systems, and software stack (CUDA, PyTorch, Python, etc.) for machine learning projects. It also covers setting up and configuring NVIDIA HGX servers (installing and updating software, managing user access, and ensuring optimal performance) and cloud infrastructure for GPU compute projects (managing access and ensuring optimal performance). Additional responsibilities include:
What experience do I need:
We welcome applications from engineers with strong problem-solving and creative-thinking skills, excellent communication, and the ability to work in a collaborative team environment. You will be an independent thinker with a startup mindset. Technology-wise, you will have experience in system administration, preferably with a focus on HPC platforms, GPU-based servers, and machine learning software environments, as well as familiarity with AI and HPC provisioning and management, both on-premises and in the cloud. You will have experience with server virtualization and containerization technologies and be well versed in the Linux operating system. You'll also ideally have:
Benefits we offer: