Senior Site Reliability Engineer (Remote)

Jobgether

United States

Remote

USD 120,000 - 160,000

Full time

Yesterday

Job summary

A technology recruitment platform, on behalf of a partner company, seeks a Senior Site Reliability Engineer in the United States. The role focuses on high availability and performance for SaaS products, with responsibilities including automation, incident response, and infrastructure management. Applicants should have 5+ years of experience in software development and cloud infrastructure, using tools such as Azure, Jenkins, and Kubernetes. The position offers a competitive salary and a flexible workplace culture.

Benefits

Competitive salary and total rewards package
Flexible and remote-friendly work culture
Comprehensive health, wellness, and retirement benefits
Opportunities for growth and continuous learning
Paid time off and leave programs

Qualifications

  • 5+ years of software development experience.
  • Proven experience supporting production SaaS systems.
  • Strong problem-solving skills in a distributed team.

Responsibilities

  • Champion and implement a culture of Site Reliability Engineering.
  • Design and maintain monitoring and incident response systems.
  • Automate operational processes including deployments.
  • Participate in 24/7 on-call rotations and lead incident analysis.

Skills

Software development experience in C# .NET or Java
Creating automated deployments with Azure DevOps, Ansible, or Jenkins
Managing cloud infrastructure in Azure
Implementing monitoring with New Relic, Dynatrace, or DataDog
Scripting in PowerShell, Python, or Bash
DevOps focus and Infrastructure as Code
Database performance monitoring and optimization
Containerization and Kubernetes
Strong collaboration and communication skills

Education

BS in Computer Science or equivalent experience

Tools

Azure DevOps
Ansible
Jenkins
Terraform
New Relic
Dynatrace
DataDog
Kubernetes

Job description

Overview

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer in the United States.

This role offers the opportunity to ensure high availability, stability, and performance for enterprise SaaS products that power critical business operations. You will work closely with engineering and operations teams to automate, monitor, and optimize infrastructure at scale while maintaining service level objectives (SLOs) and agreements (SLAs). Success in this role requires a balance of software engineering expertise and infrastructure management, with a strong focus on automation, performance, and operational excellence. You will lead initiatives to improve deployment pipelines, enhance system reliability, and proactively prevent production issues, all while collaborating across distributed teams. This is an ideal position for someone who thrives in fast-paced environments and enjoys solving complex reliability challenges.

Accountabilities
  • Champion and implement a culture of Site Reliability Engineering to maintain a robust SaaS platform.
  • Design, implement, and maintain monitoring, alerting, and incident response systems to ensure system availability, performance, and scalability.
  • Automate operational processes, including runbooks, deployments, and infrastructure tasks.
  • Optimize application performance at scale and troubleshoot complex technical issues.
  • Define and support CI/CD pipelines aligned with quality assurance and deployment strategies.
  • Participate in 24/7 on-call rotations and lead triage and root cause analysis (RCA) for production incidents.
  • Collaborate closely with engineering teams to promote best practices, eliminate bottlenecks, and improve operational efficiency.
  • Stay current with new tools, technologies, and strategies, integrating improvements into workflows and processes.
Qualifications
  • 5+ years of software development experience in languages such as C# .NET or Java.
  • 5+ years of experience creating automated deployments with tools like Harness, Azure DevOps, Ansible, or Jenkins.
  • 5+ years of experience managing cloud infrastructure as a global admin in Azure, including cost management.
  • 5+ years of experience implementing performance, availability, and scalability monitoring using tools such as New Relic, Dynatrace, DataDog, or AppDynamics.
  • 5+ years of experience scripting in PowerShell, Python, or Bash for Windows/Linux automation.
  • Proven experience supporting production, client-facing, revenue-generating SaaS systems.
  • Strong DevOps focus and experience with Infrastructure as Code (Terraform or similar).
  • Knowledge of database performance monitoring and optimization (SQL, Cosmos DB).
  • Experience securing and operating Windows or Linux systems in a 24/7 environment.
  • Experience with containerization and Kubernetes (AKS or EKS), cloud networking, firewalls, and load balancing.
  • BS in Computer Science or equivalent experience.
  • Strong collaboration, problem-solving, and communication skills in a distributed team environment.
Benefits
  • Competitive salary and total rewards package.
  • Flexible and remote-friendly work culture.
  • Comprehensive health, wellness, and retirement benefits.
  • Opportunities for professional growth and continuous learning.
  • Paid time off and leave programs for work-life balance.
  • Collaborative, values-driven team environment recognized for employee satisfaction.