
Sr. Site Reliability Engineer: Splunk Cloud Services

Splunk

North Carolina

Remote

USD 90,000 - 150,000

Full time

30+ days ago


Job summary

An established industry player is seeking a Senior Site Reliability Engineer to lead the design and build of their next-generation cloud services. In this pivotal role, you will engage with various teams to implement modern SRE practices, ensuring robust and scalable cloud offerings. Your expertise in Kubernetes, distributed systems, and performance tuning will be crucial as you work in a fully remote environment. This is a fantastic opportunity to shape engineering culture and contribute to innovative solutions that enhance user experience. Join a passionate team dedicated to making machine data accessible and valuable!

Qualifications

  • Experience with regulated computing environments like FISMA or FedRAMP.
  • Proficient in Kubernetes and its ecosystems, with certifications as a plus.
  • Strong understanding of Linux systems and networking.

Responsibilities

  • Own Splunk Cloud in FedRAMP environments and deliver quality products.
  • Work with engineering teams to build a cloud-based environment.
  • Mentor junior engineers and enhance team performance.

Skills

Kubernetes
GoLang
Python
Linux Systems
Networking
Chaos Engineering
DevOps
Performance Tuning

Tools

AWS
Azure
GCP
Splunk

Job description


Join us as we pursue our exciting vision to make machine data accessible, usable and valuable to everyone. We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers. At Splunk, we’re committed to our work, customers, having fun and most significantly to each other’s success. Learn more about Splunk careers and how you can become a part of our journey!

Role:

Splunk's Cloud Services group is looking for a Senior Site Reliability Engineer to help lead, design and build the next generation of our large scale cloud offering. You will be working on core services and applications that form the primitives for our current and future cloud service offerings. Site Reliability Engineers in this role will be engaging with multiple service owners across the platform to instruct and implement modern interpretations of SRE, observability, Chaos Engineering and DevOps. This role is highly visible and impactful to the organization and will help shape Splunk's Engineering culture for years to come. Your job, in a nutshell, is to make every team around you better... including your own!

You will:

  • Own Splunk Cloud in FedRAMP environments.
  • Work across the organization to deliver quality products that delight Splunk's passionate users.
  • Work with teams of tight-knit engineers who are building a state-of-the-art, cloud-based environment for massive-scale data processing.

Qualifications:

  • Experience working with regulated computing environments such as FISMA and/or FedRAMP, and enthusiasm for doing it better.
  • This is a fully remote, US-based, work-from-home position, so you must be a US citizen working on US soil to be considered.
  • Worked with Kubernetes, EKS, GKE or AKS and the associated ecosystems. Kubernetes certifications from the Cloud Native Computing Foundation, or an interest in acquiring them, are a plus: Certified Kubernetes Administrator (CKA), Certified Kubernetes Application Developer (CKAD), or Certified Kubernetes Security Specialist (CKS).
  • Enjoy building and running distributed systems at scale in production. You understand the challenges and trade-offs to be made when building and deploying systems to production.
  • Have a good understanding of Linux systems (network stack, file system, OS services) and networking (L2 vs. L3, network architecture, VLANs, etc.).
  • Experience with at least one programming language, preferably Go or Python. You must be able to use that language to automate Linux system tasks, including working with configuration files and system services. Knowledge of common data structures and algorithms, as well as their performance characteristics, is required.
  • Knowledge of standard methodologies related to security, performance, and disaster recovery.
  • Skilled in identifying performance bottlenecks and anomalous system behavior, and in resolving the root cause of service issues.
  • Assembled Open Source components into cohesive services.
  • Designed and developed systems and/or software features.
  • Experience mentoring junior engineers.

Preferred skills:

  • Experience monitoring cloud environments with Splunk.
  • Experience with development and deployment in a hosted cloud environment, preferably AWS, Azure or GCP. Cloud certifications, or an interest in obtaining them, are a plus, such as AWS Certified Solutions Architect, AWS Certified DevOps Engineer, or Google Associate Cloud Engineer (ACE).
  • Experience with large scale distributed cloud service development, infrastructure, traffic management and architecture.
  • Experience with distributed architectures/systems with optimized and scalable software that operates on a large number of nodes.
