Principal Research Infrastructure Engineer

King's College London

London

Hybrid

GBP 64,000 - 74,000

Full time

3 days ago

Job summary

A leading university is seeking a Principal Infrastructure Engineer to support innovative AI projects. The role involves deploying and maintaining large-scale infrastructure, mentoring junior staff, and collaborating with industry partners. The position offers technical freedom and a chance to contribute to cutting-edge research in a supportive environment.

Benefits

Salary £64,139 - £73,529
Highly flexible, mostly remote working
1 day every 2 weeks for personal development
Conference attendance
30 days annual leave

Qualifications

  • Experience deploying and maintaining large-scale infrastructure.
  • Ability to mentor junior staff and conduct code reviews.
  • Strong knowledge of security and Linux systems.

Responsibilities

  • Support AI/ML ops infrastructure services and scale compute.
  • Collaborate with a team to deliver infrastructure as code.
  • Work with leading industry partners on AI projects.

Skills

Deployment and maintenance of infrastructure
Software development
Network diagnosis
Mentoring junior staff
Cloud infrastructure
Monitoring and metrics platforms
Security fundamentals
Autonomy in engineering teams

Tools

Ansible
Puppet
Terraform
OpenTofu
Kubernetes
Linux

Job description

About us

King's e-Research department supports cutting-edge computational and data-intensive research across all disciplines at the College. We provide high-performance compute, private and public cloud infrastructure, and trusted research environments as the core building blocks of modern, data-driven research. Alongside these infrastructure services, e-Research provides Research Software Engineering, Infrastructure Engineering and Data Governance expertise to individual research projects.

About the role

We are expanding our Principal Infrastructure Engineering team to support two exciting new projects (in addition to our existing services):
  • Pharos AI, a £43m (£18.9m DSIT, £24m partners) grant to build an AI development platform to unlock the value of large multi-modal cancer datasets hosted in a pair of biobank secure data environments operated by Guy's and Thomas' and Bart's NHS Trusts. This includes extensive support from and collaboration with leading-edge industry partners, e.g. in AI precision medicine and drug discovery.
  • The King's AI+ strategic investment, recruiting 20 AI-focused fellows and adding £2m of next-generation GPU capacity to King's Computational Research Engineering and Technology Environment (CREATE).
Work for these projects will be shared across a team of four principal infrastructure engineers, who will deliver AI/ML ops infrastructure services, scale out compute to national AI supercomputers and public cloud providers, and produce high-quality, portable, open-source infrastructure as code.

Our Principal Infrastructure Engineers work collaboratively with a great deal of technical freedom and decision-making autonomy. We build almost exclusively with FOSS and wish to put more of our work back into the community over time. You can expect to work with the following technologies: Apache, Bacula, Ceph (CephFS, RADOSGW, RBD), Discourse, Flask, Git, GitLab, GLPI, Grafana, Laravel, Let's Encrypt, Linux (primarily Ubuntu), MkDocs, Nginx, OpenStack, OpenSSL, OpenSSH, OpenTofu, Open OnDemand, OpenVPN, Proxmox, Python, Puppet, Slurm, Spack, Squid, Trivy, VS Code, WireGuard, ZFS.

To get a feel for our work to date, please take a look at our docs and GitHub, and watch this presentation at CIUK 2023.

This is a full-time (35 hours per week) position, offered on a fixed-term contract currently funded until 31/5/2027, with plans to convert it to a permanent post.

We also anticipate that a second, similar position will become available shortly, subject to funding approval. Candidates may be considered for this additional role if funding is confirmed.

About you

To be successful in this role, candidates should have the following skills and experience:

Essential criteria

1. Demonstrable ability to deploy and maintain large-scale compute, storage and/or networking infrastructure through code (e.g. Ansible, Puppet, Terraform, OpenTofu)

2. Demonstrable ability to develop software with experience as the primary developer of projects with a large modular codebase, ideally dealing with issues such as concurrency, caching and performance scaling

3. Demonstrable ability to diagnose network and operating system level issues with tools such as strace and tcpdump

4. Demonstrable ability to mentor and train more junior technical staff including review of software and infrastructure project code

5. Demonstrable ability to deploy and maintain public or private cloud infrastructure and high-performance compute clusters, with experience of stability and storage engineering in relation to these

6. Demonstrable ability to deploy and maintain monitoring and metrics platforms at scale

7. Strong knowledge of security fundamentals and practical experience of securing Linux systems and related infrastructure

8. Proven ability to work with a high degree of autonomy within a high-performing engineering team, fostering a culture of transparent collaboration and building technical consensus where necessary

Desirable criteria

1. Experience profiling and optimising AI/ML workloads

2. Experience deploying, configuring and maintaining Kubernetes clusters

3. Experience developing applications for and deployed onto Kubernetes clusters

4. Performance profiling of compute and/or IO intensive workloads

5. Ability to read, understand and troubleshoot open-source software written in C

Downloading a copy of our Job Description

Full details of the role and the skills, knowledge and experience required can be found in the Job Description document, provided at the bottom of the next page after you click "Apply Now". This document explains which criteria will be assessed at each stage of the recruitment process.

Further Information

Benefits:
  • KCL Grade 8 £64,139 - £73,529
  • Highly flexible, mostly remote working (typically 1 to 8 days in the office per month, depending on personal preference)
  • 1 day every 2 weeks dedicated to personal development on relevant technology of your choosing
  • Conference attendance (e.g. CERN storage week, FOSDEM Brussels, CIUK, AI UK)
  • 35 hour week
  • 30 days annual leave (plus Christmas closure)
We pride ourselves on being inclusive and welcoming. We embrace diversity and want everyone to feel that they belong and are connected to others in our community.

We are committed to working with our staff and unions on these and other issues, to continue to support our people and to develop a diverse and inclusive culture at King's.

We ask all candidates to submit a copy of their CV, and a supporting statement, detailing how they meet the essential criteria listed in the advert. If we receive a strong field of candidates, we may use the desirable criteria to choose our final shortlist, so please include your evidence against these where possible.

To find out how our managers will review your application, please take a look at our 'How we Recruit' pages.