Site Reliability Engineer – Field Operations

Tbwa Chiat/Day Inc

London

On-site

GBP 100,000 - 125,000

Full time

30+ days ago

Job summary

An established industry player is seeking a Site Reliability Engineer for Field Operations in London. This role involves designing and implementing tailored installations of a cutting-edge AI platform while ensuring maximum system uptime and availability. You will be responsible for establishing monitoring systems, solving complex issues, and leading automation efforts to enhance deployment cycles. Join a dynamic team that values innovation and collaboration, where your contributions will directly impact the efficiency and reliability of enterprise AI applications. If you are passionate about technology and eager to tackle challenges in a fast-paced environment, this opportunity is perfect for you.

Qualifications

  • Bachelor's degree in a relevant STEM field is required.
  • Experience in managing Kubernetes-based infrastructure in public clouds.

Responsibilities

  • Design and implement customized installations of the C3 AI Platform.
  • Maximize system uptime and meet performance SLAs.

Skills

Kubernetes
Linux Operating Systems
Networking
Database concepts
Problem-solving
Communication skills
Ruby
Bash
Python

Education

Bachelor’s degree in STEM

Tools

AWS
GCP
Azure
Terraform
Ansible
Puppet

Job description

Site Reliability Engineer – Field Operations

London, UK

C3 AI (NYSE: AI) is the Enterprise AI application software company. C3 AI delivers a family of fully integrated products, including the C3 Agentic AI Platform, an end-to-end platform for developing, deploying, and operating enterprise AI applications; C3 AI applications, a portfolio of industry-specific SaaS enterprise AI applications that enable the digital transformation of organizations globally; and C3 Generative AI, a suite of domain-specific generative AI offerings for the enterprise. Learn more at: C3 AI.

Responsibilities:

  1. Work with customers to design and implement customized installations of the C3 AI Platform that meet unique access and security requirements.
  2. Maximize system uptime and availability, ensuring functional and performance SLAs are met.
  3. Establish end-to-end monitoring and alerting on all critical aspects.
  4. Solve complex problems for critical services and build automation to prevent problem recurrence.
  5. Initiate and lead scripting and automation to streamline system updates and upgrades.
  6. Set up critical infrastructure, tools, and frameworks to streamline the deployment cycle.
  7. Work cross-functionally with Services and Engineering teams.

Qualifications:

  1. Bachelor’s degree in a Science, Technology, Engineering, or Mathematics (STEM) field, or a comparable area of study.
  2. Demonstrated experience in deploying, managing, and operating scalable and fault-tolerant Kubernetes-based infrastructure in AWS, GCP, and other public clouds.
  3. Expertise in Linux Operating Systems, Networking, and Database concepts.
  4. Expertise in cloud providers, such as Amazon Web Services, Azure, and GCP.
  5. Experience with Infrastructure-as-Code and configuration management tools such as Terraform, Ansible, or Puppet.
  6. Experience in Ruby, Bash, or Python to automate and monitor systems.
  7. Excellent problem-solving, critical thinking, and communication skills.
  8. Experience supporting commercial SaaS solutions as a DevOps engineer or system administrator. Customer-facing experience is a plus.

C3 AI provides a competitive compensation package and excellent benefits.

C3 AI is proud to be an Equal Opportunity and Affirmative Action Employer. We do not discriminate on the basis of any legally protected characteristics, including disabled and veteran status.
