Site Reliability Engineer

Two Barrels LLC

United States

Remote

USD 148,000 - 175,000

Full time

2 days ago

Job summary

A leading company is seeking a Site Reliability Engineer to enhance their operational systems and ensure seamless performance. You will work remotely or in select cities to build automation tools, fix issues proactively, and drive a culture of reliability and continuous improvement. This role offers an attractive salary along with comprehensive benefits, great work-life balance, and an opportunity to be a vital part of a dynamic engineering team.

Benefits

Great Wage & Success Meetings
Work From Home comfort package
22 days paid time off annually
Up to 5% 401k employer matching
100% employer-paid medical, dental, and vision
Maternity and Paternity Leave
Flexible hours

Qualifications

  • 5+ years of experience in software engineering.
  • 2+ years in site reliability engineering or DevOps.
  • Deep experience with infrastructure as code tools.

Responsibilities

  • Build tools and automation to enhance system reliability.
  • Respond to incidents and drive improvements.
  • Ensure systems are steady, secure, and efficiently maintained.

Skills

Cloud platforms
Site reliability engineering
Kubernetes
Docker
Monitoring tools
Database optimization
Performance tuning

Education

Bachelor's degree in Computer Science

Tools

Terraform
CloudFormation
Prometheus
Grafana

Job description

Overview:
Two Barrels is looking for a Site Reliability Engineer who can help keep our systems steady, secure, and running like a well-oiled machine (except without actual oil). You'll work closely with our DevOps engineers to build out tools and automation that make things faster, easier, and less painful for everyone.
Your main job? Stop problems before they start. And when something does break (because let's be real, it will), help us fix it quickly and learn from it so we don't do the same dumb thing twice. We're big on taking ownership here. You won't get blamed for something going wrong, but you will be expected to help make it right.
If you like digging into weird errors, thinking ahead, and making things just work, even when no one notices, this might be your kind of thing.
Location:
Remote | Spokane, WA | Salt Lake City, UT | Austin, TX
Duration:
Full Time
Wage:
Up to $175,000/year
Minimum Qualifications:
  • Bachelor's degree in Computer Science, Software Engineering, or equivalent practical experience.
  • 5+ years of experience in software engineering.
  • 2+ years of experience in site reliability engineering, DevOps, or infrastructure engineering roles.
  • Deep experience with cloud platforms (AWS, Azure, or GCP) and infrastructure as code tools such as Terraform, CloudFormation, or Pulumi.
  • Strong proficiency with Kubernetes, Docker, and container orchestration in production environments.
  • Hands-on experience with observability and monitoring tools like Prometheus, Grafana, OpenTelemetry, Sentry, or New Relic.
  • Proven ability to design and implement highly available, fault-tolerant systems and lead proactive incident response efforts.
  • Experience with performance tuning, database optimization, and caching strategies (e.g., PostgreSQL, Redis, Memcached).
  • Demonstrated ability to drive reliability improvements, reduce operational toil, and foster a culture of resilience and continuous improvement.

Preferred Qualifications:
  • Experience leading reliability-focused initiatives such as post-incident reviews, capacity planning, and root cause analysis.
  • Experience in site reliability engineering within Ruby on Rails environments.
  • Familiarity with the Grafana observability stack and related tools (e.g., Alloy, Loki, Tempo, Prometheus).
  • In-depth experience with AWS services, including ECS, EKS, Route 53, and other related tools.
  • Proven ability to collaborate across teams to improve service reliability, reduce incident frequency, and drive operational excellence.
  • Ability to troubleshoot and resolve complex production issues, applying SRE best practices to minimize impact and prevent recurrence.
  • A track record of continuously driving improvements in operational efficiency and system resilience.

Why you might like this job:
You like when things work, and you're the kind of person who quietly fixes things while everyone else is still yelling "It's broken!" You think alerts should be useful, not just annoying background noise, and you enjoy building systems that mostly run themselves (because babysitting servers isn't your idea of fun).
You probably have a bit of a tinkerer's soul. Maybe you've automated your coffee maker or built a Raspberry Pi just to turn your lights purple. You appreciate clean logs, quiet dashboards, and sleep that isn't interrupted by 3 AM calls.
You want to work somewhere that's weird in a good way, where you're trusted to do your job, encouraged to ask "why?", and no one makes you sit through a meeting about synergy.
If that all sounds oddly satisfying, this might be the job for you.
Benefits:
  • Great Wage & Success Meetings with your manager
  • Work From Home comfort package & company provided equipment
  • 22 days paid time off annually, PLUS 4 paid holidays
  • Up to 5% 401k employer matching through Fidelity
  • 100% employer-paid medical, dental and vision for employees
  • Maternity and Paternity Leave
  • Flexible hours
  • Coffee shop next door
  • Crappy parking? Oh, I mean a cool downtown location for easy public transportation options...
