Site Reliability Engineer

Cerebras

United States

Remote

USD 175,000 - 205,000

Full time

Yesterday

Job summary

One, a leading fintech company, is seeking a Site Reliability Engineer (SRE) to ensure the reliability of critical services. The role involves collaboration with engineering teams to establish practices for reliability and performance, requiring significant experience with distributed systems and observability tools. Join a mission-focused team dedicated to improving financial progress for consumers across the U.S., with flexible and competitive compensation.

Benefits

Competitive cash
Remote friendly
Flexible time off programs
401(k) plan with match
Generous stock option packages

Qualifications

  • 5+ years of experience in distributed cloud native systems.
  • Expertise in observability platforms like Datadog, Splunk, or Grafana.
  • Fluency in Python, TypeScript, or Go.

Responsibilities

  • Ensure the availability and reliability of critical services.
  • Collaborate with teams to define SLOs and implement best practices.
  • Participate in a 24x7 on-call rotation.

Skills

Observability
Distributed Systems
Cloud Native
Incident Management
Mentorship

Tools

Datadog
Prometheus
Grafana
Splunk

Job description

About One

One’s mission is simple: to help customers achieve financial progress. We’re doing this by creating simple solutions to help our customers save, spend, borrow, and grow their money – all in one place.

The U.S. consumer today deserves better. Millions of Americans can’t access credit or build savings or wealth, and are left to manage their financial lives through multiple disconnected apps. Almost a quarter of U.S. adults are unbanked or underbanked, and roughly 80% of fintech users rely on multiple accounts to manage their finances.

What makes us unique? We are backed by a preeminent fintech investor (Ribbit) and the world’s largest retailer (Walmart), maintain the speed and independence of a startup, and employ a strong (and growing) collection of world-class talent.

There’s never been a better moment to build a business that helps people achieve financial progress. Come build with us!

The role

As a Site Reliability Engineer (SRE) at One, your mandate is to ensure the availability and reliability of our most critical services, and that they meet the requirements of our customers. Our SRE team at One is growing, so you’ll be a crucial early member who helps establish the team, processes, and best practices. Success in this role looks like collaborating with other teams to build and run sustainable production systems that can evolve and adapt to the changes in our fast-paced environment.

This role is responsible for:

  • Working proactively with engineering teams to help them set SLOs and implement best practices for logging and telemetry collection

  • Designing, implementing, and maintaining the tools and systems that support service reliability, monitoring, and alerting

  • Participating in a 24x7 on-call rotation supporting the health of our services

  • Driving the incident management process and supporting a blameless post-mortem culture

  • Participating in application design consulting and capacity planning

  • Defining and formalizing SRE practices and helping guide the overall reliability engineering direction

  • Providing mentorship both formally and informally to engineers at One

  • Continuously optimizing systems and workflows by improving architecture, infrastructure, automation, CI/CD, and observability

  • Combining software and systems knowledge to engineer high-volume distributed systems in a reliable, scalable, and fault-tolerant manner

You bring

  • 5+ years of relevant industry experience with a focus on distributed cloud native systems design, observability, operation, maintenance, and troubleshooting

  • 5+ years of operational experience with an observability platform like Datadog, Splunk, Prometheus/Grafana, or AppDynamics

  • Fluency in one or more programming languages (e.g. Python, TypeScript, Go)

  • A strong conviction in software development best practices, including version control, automated testing, and continuous integration and delivery

  • You’re self-motivated, inquisitive, and always looking to learn new technologies

  • You’re a great teammate who communicates clearly and transparently

  • The Triple H Factor: Humble, Hungry and Honest

  • An act-like-an-owner mentality. We have a bias toward taking action.

Pay Transparency

The estimated annual base salary for this position ranges from $175,000 to $205,000. Pay is generally based upon the level, complexity, responsibility, and job duties / requirements of the specific position. We then source candidates with the requisite skills, expertise, education, training, and experience. If you are selected for an interview, please feel welcome to speak to a Talent Partner about our compensation philosophy and other available benefits.

What it’s like working @ One

  • Competitive cash

  • Benefits effective on day one

  • Early access to a high potential, high growth fintech

  • Generous stock option packages in an early-stage startup

  • Remote friendly (anywhere in the US) and office friendly - you pick the schedule

  • Flexible time off programs - vacation, sick, paid parental leave, and paid caregiver leave

  • 401(k) plan with match

Leveling Philosophy

In order to thoughtfully scale the company and avoid downstream inequities, we’ve adopted a flat titling structure at One. Though we may occasionally post a role externally with a prefix such as “Senior” to reflect the external level of the position, we do not use such prefixes in internal titles, except for positions that manage a team. Internal titles typically include your specific functional responsibility, such as engineering, product management, or sales, and often include additional descriptors to ensure clarity of role and placement within our organization (e.g. “Engineer, Platform”, “Sales, Business Development”, or “Manager, Talent”). Employees are paid commensurate with their experience and their internal level within One.

Inclusion & Belonging

To build technology and products that are used and loved by people and solve real-world problems, we need to build a team with many different perspectives and experiences. We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. We encourage candidates from all backgrounds to apply. Applicants in need of special assistance or accommodation during the interview process or in accessing our website may contact us at talent@one.app.
