Site Reliability Engineer

TieTalent

Los Angeles (CA)

Remote

USD 120,000 - 180,000

Full time

Today

Job summary

A leading technology firm is seeking a Site Reliability Engineer to manage and optimize infrastructure reliability. The candidate should have strong skills in cloud platforms, container technologies, and automation tools. Troubleshooting, monitoring, and ensuring system performance are central to this full-time remote role. The position offers a competitive salary suited to mid-senior level professionals in the tech industry.

Qualifications

  • Experience with cloud platforms (AWS, Azure, Google Cloud) and hybrid environments.
  • Proficiency in container technologies (Docker, containerd, Podman).
  • Strong knowledge of Linux administration and networking concepts.

Responsibilities

  • Implement best practices to ensure high availability, scalability, and performance of containerized applications.
  • Set up monitoring, troubleshoot issues, and lead incident resolution.
  • Develop and maintain Terraform configurations, Helm charts, and Kubernetes manifests.

Skills

Cloud platforms
Container technologies
Linux administration
Infrastructure as Code (IaC)
Monitoring and logging
CI/CD pipelines
Scripting/programming
Troubleshooting

Job description

Must Have Technical/Functional Skills

  • Experience with cloud platforms (AWS, Azure, Google Cloud) and hybrid environments.
  • Proficiency in container technologies (Docker, containerd, Podman).
  • Strong knowledge of Linux administration and networking concepts.
  • Experience with Infrastructure as Code (IaC) tools like Terraform, Ansible, Helm, or Pulumi.
  • Monitoring and logging expertise using Prometheus, Grafana, ELK, Datadog, or Splunk.
  • Hands-on experience with CI/CD pipelines and DevOps tools (Jenkins, GitHub Actions, GitLab CI, ArgoCD).
  • Proficiency in scripting/programming (Python, Bash, Go) for automation.
  • Strong troubleshooting and incident management skills.

Roles & Responsibilities

We are seeking a highly skilled Site Reliability Engineer (SRE) to manage, optimize, and ensure the reliability of infrastructure. The ideal candidate will have deep expertise in ELK, Dynatrace, PagerDuty, PowerShell, container orchestration, cloud infrastructure, and automation, along with a strong focus on reliability, scalability, and performance. Knowledge of LogicMonitor and Python is good to have.

  • Reliability & Performance: Implement best practices to ensure high availability, scalability, and performance of containerized applications.
  • Monitoring & Incident Response: Set up monitoring (Prometheus, Grafana, ELK, Dynatrace, PagerDuty, PowerShell, etc.), troubleshoot issues, and lead incident resolution.
  • Automation & Infrastructure as Code (IaC): Develop and maintain Terraform configurations, Helm charts, and Kubernetes manifests for automation.
  • CI/CD & DevOps Integration: Work with DevOps teams to optimize CI/CD pipelines for Kubernetes deployments (Jenkins, ArgoCD, FluxCD, etc.).
  • Security & Compliance: Implement security best practices for containerized workloads, RBAC, network policies, and vulnerability scanning.
  • Capacity Planning & Optimization: Analyze resource usage and optimize infrastructure costs and performance.
  • Disaster Recovery & Backup: Implement backup and disaster recovery strategies for Kubernetes workloads.

Diverse Lynx LLC is an Equal Employment Opportunity employer. All qualified applicants will receive due consideration for employment without any discrimination. All applicants will be evaluated solely on the basis of their ability, competence and their proven capability to perform the functions outlined in the corresponding role. We promote and support a diverse workforce across all levels in the company.

Nice-to-have skills

  • AWS
  • Azure
  • Docker
  • Linux
  • Terraform
  • Ansible
  • Prometheus
  • Grafana
  • Splunk
  • Jenkins
  • GitLab CI
  • Python
  • Bash
  • Go
  • Dynatrace
  • PowerShell
  • Kubernetes

Work experience

  • Site Reliability (SRE)

Languages

  • English

Seniority level

  • Mid-Senior level

Employment type

  • Full-time

Job function

  • Engineering and Information Technology

Industries

  • Technology, Information and Internet
