Site Reliability Engineer (SRE)

Tangerine Bank

Toronto

On-site

CAD 80,000 - 120,000

Full time

Today

Job summary

A leading digital bank in Toronto is seeking a qualified SRE & Production Support professional to enhance its technology solutions. You will manage team workflows, ensure timely resolution of production issues, and play an integral part in optimizing system performance. The ideal candidate has strong problem-solving skills and experience with cloud-based software stacks, particularly Google Cloud Platform (GCP). Join us to redefine banking for Canadians.

Benefits

Access to online and in-person courses
Flexible workspace
Diverse and inclusive team environment
Comprehensive benefits package

Qualifications

  • 2-4 years of experience in developing and/or supporting complex, large-scale customer-facing platforms.
  • Strong understanding of networking concepts including TCP/IP, DNS, and HTTP protocols.
  • Experience with cloud services and platforms, particularly GCP.
  • Familiarity with scripting and automation tools such as Ansible or Terraform.

Responsibilities

  • Manage team workflow and maximize business efficiencies.
  • Supervise IT Support team and enhance customer support.
  • Resolve production issues within SLAs and oversee customer requests.
  • Monitor system health and manage application performance.
  • Lead on-call support efforts and incident management.

Skills

Multi-tier applications
Microservices (Docker, Kubernetes)
Cloud monitoring software
Networking concepts (TCP/IP, DNS)
Scripting and automation tools (Ansible, Terraform)
Incident management
Code versioning (Git)
Automated production monitoring
Problem-solving skills
Communication skills

Tools

GCP
Splunk
Ansible
Dynatrace
ServiceNow
JIRA
Confluence

Job description

Tangerine is Canada’s leading direct bank. We offer flexible and accessible banking options, innovative products, and award-winning Client service. The reason why Tangerine employees come to work each day is to help Canadians live better lives. We focus on making a difference in our communities, and that includes our own internal community. It’s important to us that our employees feel empowered and enthusiastic about belonging to our Orange culture.

As Canada’s leading digital bank, Tangerine puts technology at the heart of everything we do. We have redefined what digital banking is, and we continue to evolve what it can be, using technology to create innovative, forward-thinking banking solutions with our clients’ needs in mind. We are made up of high-performing, curious, energetic, and collaborative individuals who thrive in our high-trust agile environment to deliver best-in-class solutions for our customers. We believe in giving people hands-on challenges and the responsibilities that come with them, allowing them to grow, evolve, and create opportunities to build their careers.

Are you ready to make the change and become part of an established disruptor with the backing of a highly engaged team? If so, come join us and help redefine the Canadian banking landscape!

We are looking for an SRE & Production Support professional to join Tangerine’s Production Support and SRE team.

Is this role right for you? In this role you will:
  • Manage the team workflow to maximize business and technical efficiency. Develop and guide team members in enhancing their technical capabilities and increasing productivity.
  • Supervise the IT Support team: assign and prioritize production incidents and problems, train and coach IT Support staff on ways to improve customer support, and develop staff skills.
  • Ensure that all production issues are resolved within SLAs, that user requests are completed satisfactorily, and that all customer requests receive a timely response.
  • Investigate and provide second-level support for client issues, technical issues, system and website outages, and questions about all internal and external applications, by tracking, prioritizing, and routing them to the respective Tangerine technology groups and vendors.
  • Run the production environment by monitoring availability and taking a holistic view of system health.
  • Improve the reliability, quality, and time-to-market of our suite of software solutions.
  • Measure and optimize system performance to push our capabilities forward, get ahead of customer needs, and innovate to improve continually.
  • Maintain the production applications and day-to-day operational activities, manage escalations, and modify established procedures and approaches to suit specific situations, including 24x7 support and coordination of recovery efforts.
  • Participate in defining SLIs, SLOs and SLAs for Enterprise Systems
  • Gather and analyze metrics from both applications and infrastructure to assist in performance tuning and fault finding
  • Partner with development teams to address outstanding tickets and implement permanent fixes
  • Create sustainable systems and services through automation and process improvements.
  • Balance feature development speed and reliability with well-defined service level objectives.
  • Monitor the health of multiple applications and discover optimization opportunities in a large, complex, continuously growing hybrid environment.
  • Lead on-call problem escalation and outage recovery efforts, covering not only code fixes in the presentation and integration layers but also infrastructure-level investigation and support where necessary.
  • Lead post-incident technical retrospectives to identify and implement remediation actions.
  • Troubleshoot, deploy systems, or execute maintenance tasks as necessary to meet the specified SLOs.
Do you have the skills that will enable you to succeed in this role? We'd love to work with you if you have:
  • 2-4 years of experience in developing and/or supporting complex, large-scale customer-facing platforms.
  • Good understanding of multi-tier applications and microservices (Docker, Kubernetes, etc.).
  • Experience instrumenting and monitoring cloud-hosted software stacks (preferably GCP, Vertex AI, GCE, Network, BigQuery, Cloud SQL).
  • Good understanding of networking concepts: TCP/IP, DNS, HTTP, TLS, OSI Model.
  • Familiarity with our tech stack: Java/J2EE, Spring Boot, Python, and JavaScript/Node.js; front end: native iOS and Android apps; deployment runtimes: K8s, WebSphere, WebSphere Liberty, Node.js/TS.
  • Basic knowledge of one or more scripting or automation tools (Ansible, Terraform, Bash, etc.).
  • Strong working experience with incident management and setting up monitoring alerts.
  • Proficient understanding of code versioning tools, such as Git/Bitbucket.
  • Knowledge of building a highly automated production monitoring and support model, with hands‑on experience integrating Splunk, Ansible, Dynatrace, Sumo Logic, ServiceNow, PagerDuty, or equivalents.
  • Proven ability to translate ideas into technical and business realities and map technology to business problems.
  • Experience with private/public cloud services and platforms.
  • Superior verbal and written communication skills with the ability to influence decision‑making with stakeholders.
  • A proactive approach to spotting problems, areas for improvement, and performance bottlenecks.
  • Exceptional written and verbal communication skills
  • Excellent problem‑solving skills
  • Flexible approach to work and the ability to adapt to change
  • Prior production support or SRE experience.
  • Proficient with MS suite
  • Nice to have: Experience in building public and internal REST APIs.
  • Nice to have: Experience with CI/CD tools such as Jenkins.
  • Nice to have: Experience working with database technologies such as SQL Server or Oracle.
  • Nice to have: Experience with the Atlassian tools (JIRA, Confluence).
What's in it for you?
  • You will be part of a diverse and inclusive team of Client‑focused go‑getters looking to learn from each other in an environment that celebrates and recognizes success!
  • You will have access to thousands of online and in-person courses so you can shape your career growth with support from diverse industry leaders.
  • You will get our help to save for your future and to invest in your total wellbeing through our Tangerine benefits*.
  • You belong here. We are equal and uncomplicated. Bring your true self to work; dress codes don’t apply here.
  • You will enjoy workspace flexibility and all the excitement that comes from working at the official Bank of the Toronto Raptors.

*Tangerine employees participate in Scotiabank’s pension & benefits programs (available to permanent employees)

At Tangerine we value the unique skills and experiences each individual brings to the team, and are committed to creating and maintaining an inclusive and accessible environment. If you require accommodation during the recruitment and selection process, please let our Recruitment team know.
