Principal Site Reliability Engineer, Datastores

Cisco ThousandEyes

San Francisco (CA)

On-site

USD 176,000 - 315,000

Full time

28 days ago

Job summary

Cisco ThousandEyes is seeking a Principal Site Reliability Engineer for its Datastores team. The role involves managing mission-critical datastores, ensuring their performance and availability, and leading automation efforts. Candidates should have strong coding skills in Python or Go and experience with Infrastructure as Code tools like Terraform and Kubernetes.

Qualifications

  • Experience building and supporting mission-critical datastores.
  • Proficiency in coding with Python, Go, or similar languages.
  • Strong Infrastructure as Code skills with Terraform and Kubernetes.

Responsibilities

  • Lead automation efforts to improve operational excellence for datastores at scale.
  • Support and optimize mission-critical datastores, ensuring high availability and performance.
  • Design and implement scalable solutions using Python, Go, or similar languages.

Skills

Python
Go
Infrastructure as Code
Terraform
Kubernetes
Unix/Linux
Communication

Job description

About Cisco ThousandEyes

Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network. Powered by AI and extensive telemetry data, it helps IT teams proactively detect, diagnose, and remediate issues before they affect end users.

About The Role

The Datastores team manages mission-critical datastores such as Elasticsearch, Kafka, MongoDB, and MySQL. Its responsibilities include availability, performance, change management, capacity planning, monitoring, and emergency response. As a Principal SRE, you will innovate, provide technical vision, and help build reliable, scalable, and highly available datastores across a multi-region platform. You will collaborate with leaders, design architectures, and mentor the engineering team.

What You’ll Do

  • Lead automation efforts to improve operational excellence for datastores at scale.
  • Support and optimize mission-critical datastores, ensuring high availability and performance.
  • Design and implement scalable solutions, primarily using Python, Go, or similar languages.
  • Leverage Infrastructure as Code tools like Terraform and Kubernetes.
  • Utilize cloud services, preferably AWS, effectively in platform architecture.
  • Maintain deep knowledge of Unix/Linux systems and protocols.
  • Collaborate across teams and mentor team members.

Qualifications

  • Experience building and supporting mission-critical datastores.
  • Proficiency in coding with Python, Go, or similar languages.
  • Strong Infrastructure as Code skills with Terraform and Kubernetes.
  • Knowledge of cloud managed services, ideally AWS.
  • Understanding of Unix/Linux systems and protocols.
  • Excellent communication and documentation skills.

Additional Information

We value diversity and encourage applicants from all backgrounds to apply, even if they do not meet every qualification. The salary range for this role is $176,000 - $314,200 USD. This is a full-time, mid-senior level position in Engineering and IT, within the computer networking products industry.
