
Senior Site Reliability Engineer (SRE)

Metasys Technologies

Las Vegas (NV)

Remote

USD 90,000 - 150,000

Full time

30+ days ago


Job summary

Join a forward-thinking company as a Senior Site Reliability Engineer (SRE) in a remote position. This role focuses on ensuring the reliability, availability, and scalability of systems while collaborating with development, operations, and security teams. You will design and implement robust monitoring systems, automate operational tasks, and manage cloud infrastructure using tools like Kubernetes and Terraform. Your expertise will drive performance tuning, incident management, and CI/CD practices, contributing to the overall success of the organization. If you're passionate about system reliability and eager to make a significant impact, this is the perfect opportunity for you.

Qualifications

  • Strong experience in Linux/Unix and cloud platforms (AWS, Azure, GCP).
  • Expertise in automation tools and incident management.

Responsibilities

  • Ensure high availability and reliability of applications and infrastructure.
  • Develop and maintain automation tools for deployments and configurations.

Skills

Linux/Unix System Administration
Kubernetes
Docker
Cloud Platforms (AWS, Azure, GCP)
Terraform
Ansible
Python
Bash
Go
Incident Management

Tools

Prometheus
Grafana
ELK Stack
Datadog
Splunk
Jenkins
GitHub Actions
Terraform
Ansible
CloudFormation

Job description

Senior Site Reliability Engineer (SRE)
Remote Position
6+ Month Contract

Our client is seeking a highly skilled Senior Site Reliability Engineer (SRE) to join the team and help ensure the reliability, availability, and scalability of its systems. As a Senior SRE, you will work closely with development, operations, and security teams to build, monitor, and improve infrastructure and application performance while implementing best practices in automation and incident management.

Key Responsibilities:

System Reliability & Performance

  • Ensure high availability and reliability of applications and infrastructure.
  • Design and implement robust monitoring, logging, and alerting systems.
  • Conduct performance tuning and capacity planning to optimize system efficiency.

Automation & Infrastructure as Code (IaC)

  • Develop and maintain automation tools to manage deployments and configurations.
  • Implement Infrastructure as Code (IaC) using tools like Terraform, Ansible, or CloudFormation.
  • Automate manual operational tasks to improve efficiency and reduce downtime.

Incident Management & Troubleshooting

  • Participate in on-call rotations to quickly resolve incidents and prevent recurrence.
  • Perform root cause analysis (RCA) for production incidents and drive post-mortem reviews.
  • Develop and document runbooks to standardize response procedures.

DevOps & CI/CD

  • Work closely with development teams to implement CI/CD pipelines for faster and safer deployments.
  • Optimize build and deployment workflows using Jenkins, GitHub Actions, or similar tools.
  • Ensure security and compliance best practices are embedded in the deployment process.

Cloud & Infrastructure Management

  • Manage and optimize cloud-based infrastructure (AWS, Azure, GCP).
  • Implement container orchestration solutions using Kubernetes and Docker.
  • Ensure security best practices for cloud-based environments, including IAM and network security.

Required Skills & Qualifications:

Technical Expertise

  • Strong experience in Linux/Unix system administration.
  • Hands-on experience with Kubernetes, Docker, and cloud platforms (AWS, Azure, or GCP).
  • Proficiency in Terraform, Ansible, or CloudFormation for Infrastructure as Code.

Monitoring & Observability

  • Experience with monitoring and logging tools such as Prometheus, Grafana, ELK Stack, Datadog, or Splunk.

Automation & Scripting

  • Strong scripting skills in Python, Bash, or Go.
  • Expertise in automating operational tasks and workflows.

Incident Management & Troubleshooting

  • Ability to analyze system failures and implement preventive solutions.
  • Experience with incident response and root cause analysis.

CI/CD & DevOps Practices

  • Experience with CI/CD tools such as Jenkins, GitLab CI/CD, or GitHub Actions.
  • Familiarity with GitOps methodologies and release automation.

Security & Compliance

  • Knowledge of network security, IAM, and compliance frameworks such as SOC 2 and ISO 27001.

Preferred Qualifications:

  • Experience in SaaS, fintech, or high-scale distributed systems.
  • Certifications in AWS, Kubernetes (CKA/CKAD), or Terraform.
  • Familiarity with service mesh technologies like Istio or Linkerd.

Metasys Technologies is an equal opportunity employer. All applicants will be considered for employment without attention to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.
