Site Reliability Engineer

Diverse Lynx

Los Angeles (CA)

Remote

USD 100,000 - 130,000

Full time

Yesterday

Job summary

A leading company is looking for a Site Reliability Engineer to manage and optimize its infrastructure. The ideal candidate will have expertise in cloud platforms, containerization, and automation. Responsibilities include ensuring high availability, leading incident response, and optimizing CI/CD pipelines. This full-time remote opportunity offers a chance to work with cutting-edge technologies in a diverse environment.

Qualifications

  • Expertise in cloud platforms, containerization, and automation.
  • Strong troubleshooting and incident management skills.

Responsibilities

  • Implement best practices for high availability and performance.
  • Set up monitoring and lead incident resolution.
  • Develop and maintain Infrastructure as Code tools.

Skills

Cloud platforms
Container technologies
Linux administration
Infrastructure as Code
Monitoring and logging
CI/CD pipelines
Scripting
Troubleshooting

Job description

We are seeking a skilled Site Reliability Engineer (SRE) to manage, optimize, and ensure the reliability of our infrastructure. The ideal candidate will have expertise in cloud platforms, containerization, automation, and monitoring to support scalable and reliable systems.

Must Have Technical/Functional Skills

  1. Experience with cloud platforms (AWS, Azure, Google Cloud) and hybrid environments.
  2. Proficiency in container technologies (Docker, Podman, Kubernetes).
  3. Strong knowledge of Linux administration and networking concepts.
  4. Experience with Infrastructure as Code (IaC) tools like Terraform, Ansible, Helm, or Pulumi.
  5. Monitoring and logging expertise using Prometheus, Grafana, ELK, Datadog, or Splunk.
  6. Hands-on experience with CI/CD pipelines and DevOps tools (Jenkins, GitHub Actions, GitLab CI, ArgoCD).
  7. Proficiency in scripting/programming (Python, Bash, Go) for automation.
  8. Strong troubleshooting and incident management skills.

Roles & Responsibilities
  1. Reliability & Performance: Implement best practices to ensure high availability, scalability, and performance of containerized applications.
  2. Monitoring & Incident Response: Set up monitoring (Prometheus, Grafana, ELK, Dynatrace, PagerDuty, PowerShell), troubleshoot issues, and lead incident resolution.
  3. Automation & Infrastructure as Code (IaC): Develop and maintain Terraform configurations, Helm charts, and Kubernetes manifests for automation.
  4. CI/CD & DevOps Integration: Collaborate with DevOps teams to optimize CI/CD pipelines for Kubernetes deployments.
  5. Security & Compliance: Implement security best practices for containerized workloads, including RBAC, network policies, and vulnerability scanning.
  6. Capacity Planning & Optimization: Analyze resource usage and optimize infrastructure costs and performance.
  7. Disaster Recovery & Backup: Implement backup and disaster recovery strategies for Kubernetes workloads.

Diverse Lynx LLC is an Equal Employment Opportunity employer. All qualified applicants will receive due consideration for employment without discrimination. We promote and support a diverse workforce across all levels in the company.
