Senior Site Reliability Engineer

Censys

United States

Remote

USD 145,000 - 195,000

Full time

3 days ago

Job summary

A leading technology company is seeking a Senior Site Reliability Engineer to enhance developer efficiency and operational maturity. This fully remote position emphasizes collaboration, tool development for Kubernetes, and support for cloud-native applications. With a focus on improving the engineering workflow, the role involves working closely with development teams to ensure reliability and resilience of services across the organization.

Benefits

401k match
Health insurance
Vision insurance
Dental insurance
Bonus and equity

Qualifications

  • 5+ years of experience in an SRE role or similar.
  • Experience deploying and managing applications in the cloud.
  • Strong communication skills to support developers.

Responsibilities

  • Build and maintain tooling for Kubernetes and Google Cloud Platform.
  • Collaborate with teams to deploy services confidently.
  • Ensure smooth operations of production environments.

Skills

Communication skills
Collaboration
Empathy

Tools

Kubernetes
Terraform
Prometheus
Grafana
Go
Python
Scala

Job description

Location

This is a fully remote position within the United States.

Role Summary

As a Senior Site Reliability Engineer (SRE) on the Infrastructure and Ops platform team, you will help design, build, and deploy the tools used to empower our development teams and production applications. We’re looking for talented engineers to help grow our operational maturity, as well as master cloud-native technologies to build and support our microservice architecture’s growth and reliability.

As an SRE focused on developer efficiency and experience, you will improve the efficiency of our engineering and development teams by supporting our developers' SDLC and workflows. This includes writing supporting application code and automation, and empowering developers to create, deploy, and manage their services end-to-end inside the platform.

What you'll do

  • Build and maintain tooling to support our applications in Kubernetes and on Google Cloud Platform.
  • Collaborate with development teams to help them build, ship, and deploy services and applications confidently, promoting service resilience and reliability.
  • Ensure smooth operation of our production environments and assist in debugging complex issues, including monitoring the four golden signals in our applications.
  • Create a self-service platform by working with the SRE and infrastructure team to accelerate developer velocity, including service catalogs, repository tooling, and documentation. We prioritize a self-service model, treating development teams as internal customers, seeking feedback, and iterating quickly to add value.
  • Participate in a shared on-call rotation, maintaining infrastructure environments and ensuring primary site uptime, with a focus on end-to-end service ownership.

Required Qualifications

  • 5+ years of experience in an SRE role or similar.
  • Experience deploying, managing, and debugging applications in Kubernetes, using Helm and Crossplane.
  • Experience building, securing, and managing container images.
  • Experience with cloud environments and services such as Cloud SQL, Pub/Sub, and Memorystore.
  • Familiarity with infrastructure-as-code tools like Terraform or Crossplane.
  • Experience with monitoring tools for latency, traffic, errors, and load, such as Prometheus, Grafana, and OpenTelemetry.
  • Familiarity with monorepos, trunk-based development, and CI/CD systems like GitHub Actions or Argo CD, plus a desire for continuous deployment.
  • Strong communication skills and empathy for supporting developers, with a focus on automation and self-service to improve developer velocity.

Preferred Qualifications

  • Experience with gRPC microservice architectures and Kubernetes service meshes (e.g., Istio).
  • Ability to interface with application code, primarily in Go, with some Python and Scala, to implement best practices and shared libraries.
  • Familiarity with application security tools like dependency scanners, static analysis, and linting tools.
  • Comfort with Linux-based environments.

Qualities

  • Passion for clean, concise architecture and GitOps environments.
  • Comfort with projects involving uncertainty and risk.
  • Ability to collaborate with product management and leadership, balancing maintainability against rapid development, and to communicate clearly about business continuity and disaster recovery (BCDR).
  • Understanding and practice of continuous delivery principles for quick, safe, and sustainable development.

What will make you stand out

  • Basic knowledge of infrastructure operations, including load balancers, ingress, DNS, and VPC design.
  • Willingness to explore code to understand applications, improve testing, metrics, and reliability.
  • Deep understanding of web application optimization and security, including anti-DDoS and WAF technologies.

Our target salary range for this role is $145,000 to $195,000 USD, plus bonus and equity.

Our benefits, effective from day one, include 401k match, health, vision, dental, and more! Please see our careers page for details.

Note to external recruiters/agencies: We are not currently engaging with third-party agencies for this role and will not accept unsolicited submissions. Please refrain from submitting resumes or profiles.