
Staff Engineer - Cloud Development

Graphcore

Bristol

On-site

GBP 70,000 - 90,000

Full time

2 days ago

Job summary

An innovative AI technology company in Bristol is seeking a Staff Engineer for its Cloud Development Team. The role involves developing and deploying end-user services, requiring strong cloud infrastructure experience and skills in Infrastructure-as-Code. Ideal candidates will have proven IT and software engineering experience, particularly in Linux environments. The position offers a competitive salary, flexible working, and a commitment to diversity and inclusion.

Benefits

Private medical insurance
Pension contributions
Generous leave
Flexible working

Qualifications

  • Proven software engineering or IT experience.
  • Experience working within Agile and Scrum frameworks.
  • Linux system administration experience (Ubuntu, RHEL).

Responsibilities

  • Develop and operate end-user services on private clouds.
  • Build automation for metrics collection and analysis.
  • Maintain and operate AI system fleets in private clouds.

Skills

Cloud infrastructure
Infrastructure-as-Code
Networking
Storage systems
Linux scripting
End-user support
Communication

Education

Bachelor's degree or equivalent in a relevant field

Tools

Terraform
Ansible
Docker
Git
Grafana
Prometheus

Job description

Graphcore is one of the world’s leading innovators in Artificial Intelligence compute.

It develops hardware, software, and systems infrastructure to enable the next generation of AI breakthroughs and promote widespread AI adoption across industries.

As part of the SoftBank Group, Graphcore belongs to an elite family of companies responsible for transformative technologies. Their shared vision is to enable Artificial Super Intelligence and make its benefits accessible to all.

Graphcore’s teams are diverse, comprising AI research specialists, silicon designers, software engineers, and systems architects, fostering a culture of continuous learning and innovation.

Job Summary

We seek a Staff Engineer for our Cloud Development Team to develop and deploy services. Collaborating with Platform Engineering, Data Centre Operations, and Product Development teams, you will deploy services on our advanced AI systems, including in-house hardware and off-the-shelf servers, switches, and storage solutions. This hands-on role requires a strong background in cloud infrastructure, Infrastructure-as-Code deployment, networking, and storage systems. Experience in IT, data centres, cloud providers, or orchestration/cloud services is desirable.

The Platform Engineering Team at Graphcore

We integrate Graphcore products into large-scale AI solutions for internal and external customers. The work often involves pre-release hardware and software, so you should be comfortable working with unproven components.

Responsibilities and Duties

  • Develop and operate end-user services on private clouds, supporting internal users and translating requirements into deployed services.
  • Build automation for metrics collection and analysis to identify and report issues, working with users and engineering teams.
  • Maintain and operate AI system fleets in private clouds in collaboration with Data Centre Operations Engineers.
  • Configure and test new hardware and systems using Continuous Deployment and Infrastructure-as-Code in data centres.
  • Integrate third-party hardware solutions into our Cloud Reference Design in partnership with vendors.

Skills and Experience

  • Bachelor's degree or equivalent in a relevant field.
  • Proven software engineering or IT experience with a track record of delivering results.
  • Experience working within Agile and Scrum frameworks.
  • Strong Linux scripting skills (Bash, Python, awk, sed).
  • Linux system administration experience (Ubuntu, RHEL).
  • Experience with version control systems (preferably Git).
  • Familiarity with CI/CD pipelines (GitLab, GitHub).
  • Understanding of cloud service technologies (APIs, virtualization, networks, storage, resource management).
  • Experience with Infrastructure-as-Code tools (Terraform, Ansible, Packer).
  • Experience with container management (Docker).
  • Knowledge of monitoring and observability tools (Grafana, Prometheus, Elasticsearch, Loki).
  • Good communication and end-user support skills.
  • Ability to work independently on critical infrastructure with minimal oversight.

Desirable Skills

  • Experience with OpenStack cloud platforms.
  • Managing production Kubernetes clusters.
  • Python 3 programming, including object-oriented design with classes and inheritance.

Graphcore offers a competitive salary, flexible working, generous leave, private medical insurance, pension contributions, and a commitment to diversity and inclusion. Note: Applicants must have the right to work in the UK; visa sponsorship is not available at this time.
