Site Reliability Engineer

C3 AI

Camden Town

On-site

GBP 60,000 - 80,000

Full time

Today

Job summary

A leading AI solutions provider in Camden Town is seeking a Site Reliability Engineer to maximize system uptime and meet performance SLAs. You will establish monitoring, solve complex problems, and lead automation efforts across cloud platforms. Strong experience with Linux, Kubernetes, and cloud services such as AWS or GCP is essential. The ideal candidate has a degree in Computer Science and is proficient in configuration management tools such as Ansible or Puppet.

Qualifications

  • Demonstrated experience in deploying, managing, and operating scalable and fault-tolerant Linux/Kubernetes infrastructure.
  • Experience supporting SaaS solutions as a DevOps engineer or system administrator.
  • Experience with configuration management systems like Ansible or Puppet.

Responsibilities

  • Maximize system uptime and availability, ensuring SLAs are met.
  • Establish end-to-end monitoring on critical aspects.
  • Influence new designs and methods for platform support.
  • Lead automation of system updates and upgrades.
  • Set up critical infrastructure to streamline deployment.

Skills

Problem-solving
Critical thinking
Automation
Communication skills
Linux
Networking
Database concepts
DevOps

Education

BS or MS in Computer Science or related field

Tools

Linux
Kubernetes
AWS
GCP
Ansible
Puppet
Ruby
Python
Cassandra

Job description

  • Maximize system uptime and availability, ensuring functional and performance SLAs.
  • Establish end-to-end monitoring and alerting on all critical aspects.
  • Solve complex problems for critical services and build automation to prevent problem recurrence.
  • Influence and create new designs, architectures, standards, and methods for supporting the platform.
  • Initiate and lead scripting and automation to streamline system updates and upgrades.
  • Set up critical infrastructure, tools, and framework to streamline the deployment cycle.
  • Demonstrated experience in deploying, managing, and operating scalable and fault-tolerant Linux/Kubernetes/JVM-based infrastructure in AWS, GCP, and other public clouds.
  • Expertise in Linux Operating Systems, Networking, and Database concepts.
  • Experience with Cassandra (or another NoSQL alternative).
  • Expertise in cloud providers, such as Amazon Web Services, Azure, and GCP.
  • Experience with configuration management systems such as Ansible or Puppet.
  • Experience using Ruby or Python to automate and monitor systems.
  • Excellent problem-solving, critical thinking, and communication skills.
  • Experience supporting commercial SaaS solutions as a DevOps engineer or system administrator.
  • BS or MS in Computer Science, related field, or equivalent professional experience.