Senior Site Reliability Engineer (Remote)

Jobgether

United States

Remote

USD 120,000 - 160,000

Full time

Yesterday

Job summary

Jobgether, on behalf of a partner company, seeks a Senior Site Reliability Engineer in the United States. The role focuses on high availability and performance for SaaS products, with responsibilities including automation, incident response, and infrastructure management. Applicants should have 5+ years of experience in software development and cloud infrastructure, along with tools such as Azure, Jenkins, and Kubernetes. The position offers a competitive salary and a flexible, remote-friendly work culture.

Benefits

Competitive salary and total rewards package

Flexible and remote-friendly work culture

Comprehensive health, wellness, and retirement benefits

Opportunities for growth and continuous learning

Paid time off and leave programs

Qualifications

5+ years of software development experience.
Proven experience supporting production SaaS systems.
Strong problem-solving skills in a distributed team.

Responsibilities

Champion and implement a culture of Site Reliability Engineering.
Design and maintain monitoring and incident response systems.
Automate operational processes including deployments.
Participate in 24/7 on-call rotations and lead incident analysis.

Skills

Software development experience in C# .NET or Java

Creating automated deployments with Azure DevOps, Ansible, or Jenkins

Managing cloud infrastructure in Azure

Implementing monitoring with New Relic, Dynatrace, or DataDog

Scripting in PowerShell, Python, or Bash

DevOps focus and Infrastructure as Code

Database performance monitoring and optimization

Containerization and Kubernetes

Strong collaboration and communication skills

Education

BS in Computer Science or equivalent experience

Tools

Azure DevOps

Ansible

Jenkins

Terraform

New Relic

Dynatrace

DataDog

Kubernetes

Overview

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer in the United States.

This role offers the opportunity to ensure high availability, stability, and performance for enterprise SaaS products that power critical business operations. You will work closely with engineering and operations teams to automate, monitor, and optimize infrastructure at scale while maintaining service level objectives (SLOs) and agreements (SLAs). Success in this role requires a balance of software engineering expertise and infrastructure management, with a strong focus on automation, performance, and operational excellence. You will lead initiatives to improve deployment pipelines, enhance system reliability, and proactively prevent production issues, all while collaborating across distributed teams. This is an ideal position for someone who thrives in fast-paced environments and enjoys solving complex reliability challenges.

Accountabilities

Champion and implement a culture of Site Reliability Engineering to maintain a robust SaaS platform.
Design, implement, and maintain monitoring, alerting, and incident response systems to ensure system availability, performance, and scalability.
Automate operational processes, including runbooks, deployments, and infrastructure tasks.
Optimize application performance at scale and troubleshoot complex technical issues.
Define and support CI/CD pipelines aligned with quality assurance and deployment strategies.
Participate in 24/7 on-call rotations and lead triage and root cause analysis (RCA) for production incidents.
Collaborate closely with engineering teams to promote best practices, eliminate bottlenecks, and improve operational efficiency.
Stay current with new tools, technologies, and strategies, integrating improvements into workflows and processes.

Qualifications

5+ years of software development experience in languages such as C# .NET or Java.
5+ years of experience creating automated deployments with tools like Harness, Azure DevOps, Ansible, or Jenkins.
5+ years of experience managing cloud infrastructure as a global admin in Azure, including cost management.
5+ years of experience implementing performance, availability, and scalability monitoring using tools such as New Relic, Dynatrace, DataDog, or AppDynamics.
5+ years of experience scripting in PowerShell, Python, or Bash for Windows/Linux automation.
Proven experience supporting production, client-facing, revenue-generating SaaS systems.
Strong DevOps focus and experience with Infrastructure as Code (Terraform or similar).
Knowledge of database performance monitoring and optimization (SQL, Cosmos DB).
Experience securing and operating Windows or Linux systems in a 24/7 environment.
Experience with containerization and Kubernetes (AKS or EKS), cloud networking, firewalls, and load balancing.
BS in Computer Science or equivalent experience.
Strong collaboration, problem-solving, and communication skills in a distributed team environment.

Benefits

Competitive salary and total rewards package.
Flexible and remote-friendly work culture.
Comprehensive health, wellness, and retirement benefits.
Opportunities for professional growth and continuous learning.
Paid time off and leave programs for work-life balance.
Collaborative, values-driven team environment recognized for employee satisfaction.
