Lead Site Reliability Engineer

RBC

Toronto

On-site

CAD 100,000 - 130,000

Full time

30+ days ago

Job summary

A major financial institution in Toronto is seeking a Senior Site Reliability Engineer to improve the reliability and performance of its Apigee API Gateway platform. The role involves troubleshooting, automation, and collaboration with development teams. Ideal candidates have strong knowledge of API security, cloud technologies, and observability tools. The offer includes opportunities for professional growth and a supportive work environment.

Benefits

Competitive compensation
Flexible work/life balance
Development coaching opportunities

Qualifications

  • Production support experience with infrastructure technologies including API Gateway platforms.
  • Hands-on experience with cloud technologies such as OpenShift and Kubernetes.
  • Solid understanding of networking concepts.

Responsibilities

  • Serve as primary operational support for the Apigee API Gateway platform.
  • Build and maintain tools to automate operational processes.
  • Lead incident management and root cause analysis.

Skills

API security (OAuth2.0, JWT)
Cloud technologies (OpenShift, Kubernetes, Azure Kubernetes Service)
Proactive problem-solving
Observability tools (Dynatrace, Splunk, Elastic, Grafana)

Job description

What is the Opportunity?
We are expanding our Digital SRE team and are seeking a Senior Site Reliability Engineer (SRE) to join us in supporting our Apigee API Gateway platform. This is an exciting opportunity to work with cutting-edge cloud-native technologies and play a critical role in ensuring the reliability, scalability, and performance of our digital platforms. If you are a problem-solver, a mentor, and a leader who thrives in a fast-paced environment, this role is for you.
As an SRE, you will collaborate with application teams to troubleshoot Apigee-related issues, provide technical guidance, and drive automation to improve operational efficiency. You will also play a key role in shaping our SRE culture and practices, ensuring our systems are robust, secure, and compliant.

What Will You Do?

Platform Support & Troubleshooting:

  • Serve as the primary operational support for the Apigee API Gateway platform, ensuring its reliability, availability, and performance.
  • Assist application teams in troubleshooting and resolving Apigee-related issues, including API design, security, and performance optimization.
  • Manage API lifecycle, including OpenAPI/Swagger specifications, rate limiting, throttling, quota management, and OAuth2.0/JWT authentication.

Automation & Reliability Engineering:

  • Build and maintain tools to automate operational processes, including monitoring, logging, and alerting.
  • Develop and implement SRE solutions to improve system reliability, scalability, and performance.
  • Continuously evaluate and optimize system performance using observability tools like Dynatrace, Splunk, Elastic, and Grafana.

Collaboration & Leadership:

  • Partner with development teams to improve services through rigorous testing, release procedures, and capacity planning.
  • Provide technical leadership by conducting code reviews, publishing technical designs, and mentoring team members.
  • Drive SRE adoption and transformation by organizing engineering mindset meetups and sharing best practices.

Incident & Problem Management:

  • Monitor system health holistically and proactively identify areas for improvement.
  • Lead incident management and root cause analysis for production issues, ensuring lessons learned are applied to prevent recurrence.
  • Maintain compliance and technology currency, including server patching, certificate renewals, and segregation of duties.

What Do You Need to Succeed?

Must-Have Qualifications:

  • Production support experience with infrastructure technologies, including API Gateway platforms like Apigee, Kong, Nginx, or AWS/Azure API Management.
  • Strong expertise in API security (OAuth2.0, JWT), API design (OpenAPI/Swagger), and developer portal management.
  • Experience as an SRE supporting cloud and legacy applications.
  • Hands-on experience with cloud technologies such as OpenShift, Kubernetes, and Azure Kubernetes Service (AKS).
  • Proficiency in observability tools (Dynatrace, Splunk, Elastic, Grafana) and end-to-end application monitoring.
  • Solid understanding of networking concepts, including certificates, load balancers, and DNS.
  • A proactive approach to identifying and solving problems, with a strong focus on automation and innovation.

Nice-to-Have Qualifications:

  • Familiarity with Google SRE best practices.
  • Experience leading and mentoring geo-distributed teams.
  • Strong communication and collaboration skills, with the ability to work effectively across teams and functions.

What’s in it for you?

  • A comprehensive Total Rewards Program including bonuses and flexible benefits, competitive compensation, commissions, and stock where applicable
  • Leaders who support your development through coaching and managing opportunities
  • Ability to make a difference and lasting impact
  • Work in a dynamic, collaborative, progressive, and high-performing team
  • Flexible work/life balance options
  • Opportunities to do challenging work
  • Opportunities to take on progressively greater accountabilities
  • Opportunities to build close relationships with clients

We thrive on the challenge to be our best, progressive thinking to keep growing, and working together to deliver trusted advice to help our clients thrive and communities prosper. We care about each other, reaching our potential, making a difference to our communities, and achieving success that is mutual.

At RBC, we believe an inclusive workplace that has diverse perspectives is core to our continued growth as one of the largest and most successful banks in the world. Maintaining a workplace where our employees feel supported to perform at their best, effectively collaborate, drive innovation, and grow professionally helps to bring our Purpose to life and create value for our clients and communities. RBC strives to deliver this through policies and programs intended to foster a workplace based on respect, belonging and opportunity for all.
