FedRAMP Site Reliability Engineer (Remote - Canada)

Confluent

British Columbia

Hybrid

CAD 90,000 - 120,000

Full time

Today

Job summary

A leading data technology company in British Columbia is seeking a Site Reliability Engineer to manage Confluent Cloud systems for government agencies. This role requires expertise in Cloud Native technologies and Kubernetes, with responsibilities including operational excellence, incident handling, and 24/7 on-call support. Ideal candidates have 3-5 years of relevant experience and a BS degree in a related field.

Qualifications

  • 3-5 years of relevant experience.
  • Expertise in Cloud Native technologies with experience operating production services in the cloud.
  • Deep knowledge of Kubernetes and containerization.

Responsibilities

  • Own and champion high operational standards of Confluent Cloud systems leveraged by federal agencies.
  • Participate in a 24/7 on-call rotation to maintain the integrity of Confluent Cloud for Government systems.

Skills

Cloud Native technologies
Distributed Systems design
Kubernetes
Infrastructure as code
Scripting (Go, Java, Python, Bash)
Telemetry tooling (DataDog, Grafana, Prometheus)
BCP/DR exercises
Critical problem-solving
Exceptional teamwork
Automation

Education

BS Degree in Computer Science, Engineering, or equivalent

Job description

We’re not just building better tech. We’re rewriting how data moves and what the world can do with it. With Confluent, data doesn’t sit still. Our platform puts information in motion, streaming in near real-time so companies can react faster, build smarter, and deliver experiences as dynamic as the world around them.

It takes a certain kind of person to join this team. Those who ask hard questions, give honest feedback, and show up for each other. No egos, no solo acts. Just smart, curious humans pushing toward something bigger, together.

One Confluent. One Team. One Data Streaming Platform.

About the Role:

Do you have a passion for data that turns events into outcomes, enables intelligent, real-time apps, and empowers teams and systems to act on data instantly? Have you ever dreamt about the opportunity to work with key agencies of the public sector? Confluent's team of Federal Site Reliability Engineers will allow you to do just that by putting you in the driver's seat to deliver highly performant, reliable systems that enable prominent public sector agencies to make real-time decisions with their data and solve real-time problems through Confluent Cloud. Confluent Cloud delivers a complete end-to-end streaming experience in a Software as a Service (SaaS) model.

What You Will Do:
  • Understand and participate in the changing FedRAMP space by quickly ramping up with the 20x controls and building upon these to maintain federal compliance

  • Own and champion high operational standards of Confluent Cloud systems leveraged by federal agencies

  • Deploy production changes to Confluent Cloud systems and infrastructure through established change management processes

  • Assist with process improvements and adoption of change management

  • Own monitoring and incident handling of complex distributed systems, engaging engineering teams when needed through an escort model system

  • Act as a core member of Confluent's Business Continuity Plan and Disaster Recovery team, with efforts across 3 large verticals

  • Innovate and design solutions to reduce toil, bolster operational maturity, and make day-to-day work life easier

  • Participate in a 24/7 on-call rotation to maintain the integrity of Confluent Cloud for Government systems

What You Will Bring:
  • 3-5 years of relevant experience

  • Expertise in Cloud Native technologies with experience operating production services in the cloud

  • Strong fundamentals of Distributed Systems and their design

  • Deep knowledge of Kubernetes and containerization

  • Strong infrastructure as code knowledge (Terraform preferred)

  • Experience with telemetry tooling to monitor production systems (DataDog, Grafana, Prometheus)

  • Experience with BCP/DR and high availability exercises

  • Ability to quickly problem-solve and troubleshoot critical services

  • Proficiency with scripting and automation (e.g., Go, Java, Python, Bash)

  • Exceptional teamwork, collaboration skills, and the ability to act critically with minimal supervision at times in a remote-first environment

  • Experience with a rotating on-call schedule to provide 24/7 support

  • BS Degree in Computer Science, Engineering, or equivalent experience

Ready to build what’s next? Let’s get in motion.

Come As You Are

Belonging isn’t a perk here. It’s the baseline. We work across time zones and backgrounds, knowing the best ideas come from different perspectives. And we make space for everyone to lead, grow, and challenge what’s possible.

We’re proud to be an equal opportunity workplace. Employment decisions are based on job-related criteria, without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, veteran status, or any other classification protected by law.
