Site Reliability Engineer Lead

Safe Fleet

Port Moody

On-site

USD 100,000 - 125,000

Full time

30+ days ago

Job summary

Join a forward-thinking company as a Site Reliability Engineer Lead, where your expertise will enhance cloud infrastructure resilience and efficiency. This role offers a unique blend of technical leadership and innovation, focusing on optimizing both infrastructure and application layers to meet service level objectives. You will lead a dynamic team, implement best practices in site reliability, and drive automation projects that significantly improve operational efficiencies. If you are passionate about safety and technology, and want to make a real impact in a fast-growing environment, this opportunity is perfect for you.

Qualifications

  • 5+ years in cloud or DevOps engineering with leadership experience.
  • Strong knowledge of Azure and Kubernetes, with automation skills.

Responsibilities

  • Lead the definition and implementation of SLIs and SLOs.
  • Oversee SRE practices and mentor the incident response team.
  • Drive automation projects for operational efficiencies.

Skills

Cloud Engineering
DevOps Practices
Project Management
Analytical Skills
Communication Skills

Education

Bachelor's Degree in Computer Science or related field

Tools

Azure
Kubernetes
Grafana
Prometheus
Elasticsearch
Terraform
Docker
Jenkins
Azure CLI
PowerShell
Bash

Job description

Meet the Smart Safety Company

At Safe Fleet, our name says it all. We make fleet vehicles – and everyone in and around them – safer. Our fleet safety platform brings together best-in-class products, ground-breaking technology, and a 100-year history of fleet know-how and innovation to solve the world’s biggest fleet safety problems.

Our core value is safety. Without safety first, efficiency and productivity are not possible. This is true for our products, our culture, and our relationship with our community. Our vision is to reduce preventable deaths and injuries in and around fleet vehicles with a goal of ZERO accidents.

We are re-defining what safety means for fleets of every type – from school buses to waste collection trucks, firefighting to utility vehicles, police cruisers to delivery vans.

Whether you work in our Charlotte plant to build life-saving stop arms for school buses, or design advanced camera vision products in our Vancouver office, forge valves and high-quality nozzles to fight fires, or dream up new ways to protect fleet operators in our Corporate HQ in Kansas City, you’ll contribute to our goal to keep everyone safe.

We are a fast-growing manufacturing, service, and technology company with over 1700 employees in over 15 locations across Canada and the US. We’re looking for motivated self-starters with innovative thinking to join our team and help us achieve our growth and performance goals. Sound like you?

Job Summary

As the Site Reliability Engineer Lead at Safe Fleet, you will be a key leader in enhancing our cloud infrastructure's resilience and efficiency. This role combines deep technical expertise with leadership responsibilities, ensuring that both infrastructure and application layers are optimized to meet our service level objectives (SLOs) and service level agreements (SLAs).

Responsibilities

  • Lead the definition and implementation of Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for each workflow to enhance product resiliency and reliability.
  • Oversee and improve SRE practices, ensuring system availability, scalability, and observability.
  • Collaborate with software engineering teams to embed effective SRE practices into the development lifecycle.
  • Mentor and lead our 24/7 incident response team, fostering a culture of continuous improvement and technical excellence.
  • Drive the adoption of automation and orchestration projects that improve operational efficiencies and proactive management capabilities.
  • Manage operational incident response, change management, and root cause analysis workflows, setting best practices in these areas.
  • Engage in capacity management and proactive performance tuning to ensure that our services meet the demands of our SLAs.
  • Document and maintain operational procedures, ensuring that they align with best practices and promote knowledge sharing across teams.

Salary: $100,000 to $125,000/yr

Requirements

  • Minimum 5 years of experience in cloud or DevOps engineering, with a proven track record in a leadership or technical lead role.
  • Strong knowledge of cloud services (Azure preferred), container orchestration via Kubernetes, and infrastructure as code practices.
  • Proficiency in monitoring tools such as Grafana, Prometheus, and Elasticsearch.
  • Excellent communication and project management skills, capable of leading projects and initiatives independently.
  • Experience with automation tools and scripting languages such as PowerShell, Bash, Terraform, Kubernetes, Docker, Jenkins, and Azure CLI.
  • Strong analytical skills and the ability to engage with both technical staff and executive-level stakeholders.
