Site Reliability Engineer

HappyRobot

San Francisco (CA)

On-site

USD 105,000 - 250,000

Full time

Yesterday

Job summary

Join a high-growth AI startup as a Site Reliability Engineer, where you will be essential in enhancing system stability and operational resilience. This role offers an opportunity to take ownership and influence reliability practices in a fast-paced, innovative environment backed by top-tier investors.

Benefits

Top-Tier Compensation
Fast Growth Opportunity
Ownership & Autonomy in Projects
Work with a World-Class Team

Qualifications

  • 1+ years of hands-on experience debugging production systems.
  • Comfort with Python and Go for reading code.
  • Familiarity with observability and monitoring tools.

Responsibilities

  • Own stability, observability, and debugging workflows.
  • Design tools to improve operational resilience.
  • Untangle complex failures in real time.

Skills

Problem-Solving
Debugging
Python
Go
Observability Tools
Clear Communication

Tools

Datadog
Prometheus
Sentry

Job description

About HappyRobot

HappyRobot is a platform to build and deploy AI workers that automate communication.

Our AI workers connect to any system or data source to handle phone calls, email, messages…

We target the logistics industry, which relies heavily on communication to book, check on, and pay for freight. We primarily work with freight brokers, 3PLs, freight forwarders, shippers, warehouses, and other supply chain enterprises and tech startups.

We raised a Series A round from a16z and YC and we’re growing very fast.

We're looking for rockstars with a relentless drive, unstoppable energy, and a true passion for building something great—ready to embrace the challenge, push limits, and thrive in a fast-paced, high-intensity environment.

About The Role

We're looking for a Site Reliability Engineer to take the lead on scaling our operational resilience as we grow. You’ll own the stability, observability, and debugging workflows that keep our systems running smoothly. You'll be the go-to person for untangling complex failures in real time, designing tools that turn chaos into clarity, and helping us shift from reactive to proactive operations.

This is a high-impact, high-trust role where you’ll shape how reliability is done: reducing incident load, building internal tooling, and directly improving developer focus and system uptime. If you love getting to the root of hard problems and making systems (and teams) stronger, this is your moment.

Must-Have

  • 1+ years of hands-on experience debugging production systems (logs, traces, incidents, etc.)
  • Strong problem-solving skills and ability to dive into unfamiliar backend codebases
  • Comfort with Python and Go for reading code and writing small tools/utilities
  • Familiarity with observability and monitoring tools (e.g., Datadog, Prometheus, Sentry)
  • Clear, calm communication under pressure — especially during live incidents

Nice-to-Have

  • Experience working with distributed systems or services at scale
  • Experience building or maintaining internal tooling for on-call teams or reliability workflows
  • Familiarity with deployment pipelines, CI/CD, or infra-as-code
  • Experience improving system observability (e.g., custom metrics, traces, log pipelines)

Why join us?

  • Opportunity to work at a high-growth AI startup, backed by top investors.
  • Fast Growth - Backed by a16z and YC, on track for double-digit ARR.
  • Top-Tier Compensation - Competitive salary + equity in a high-growth startup.
  • Ownership & Autonomy - Take full ownership of projects and ship fast.
  • Work With the Best - Join a world-class team of engineers and builders.

The personal data provided in your application and during the selection process will be processed by Happyrobot, Inc., acting as Data Controller.

By sending us your CV, you consent to the processing of your personal data for the purpose of evaluating and selecting you as a candidate for the position. Your personal data will be treated confidentially and will only be used for the recruitment process of the selected job offer.

Your personal data will be deleted after three months of inactivity, in compliance with the GDPR and applicable data protection legislation.

If you wish to exercise your rights of access, rectification, erasure, portability, or objection in relation to your personal data, you can do so by contacting security@happyrobot.ai, subject to the GDPR.

For more information, visit https://www.happyrobot.ai/privacy-policy

By submitting your request, you confirm that you have read and understood this clause and that you agree to the processing of your personal data as described.

  • Seniority level: Entry level
  • Employment type: Full-time
  • Job function: Engineering and Information Technology
  • Industries: Software Development
