SRE & Automation Engineer

BMO Financial Group

Toronto

On-site

CAD 60,000 - 112,000

Full time

Yesterday

Job summary

An established industry player is seeking a skilled Site Reliability Engineer to enhance user experiences through innovative software solutions. In this role, you will lead initiatives in large-scale cloud and on-prem environments, ensuring high availability and performance. Your expertise in OS scripting, CI/CD tools, and cloud platforms will be crucial for optimizing system reliability and fostering collaboration across teams. This position offers an exciting opportunity to contribute to a dynamic IT environment, where your insights will drive strategic decisions and improvements. Join a forward-thinking organization that values diversity and continuous learning.

Benefits

Health insurance
Tuition reimbursement
Retirement plans

Qualifications

  • 7-10 years of experience in site reliability engineering.
  • Strong understanding of IT operational processes and monitoring.

Responsibilities

  • Build and manage platform infrastructure and applications.
  • Monitor system health and optimize performance.
  • Collaborate with teams to troubleshoot infrastructure.

Skills

OS scripting (UNIX, Bash)
CI/CD and automation tools
Source code management (GitHub)
Distributed storage knowledge
Cloud platforms (AWS, Azure)
REST APIs
Proactive problem-solving
Analytical and influence skills
Collaboration and team skills

Education

Master’s degree in computer science

Tools

Ansible
JIRA
Kubernetes
Mesos
YARN

Job description

At BMO, we’re passionate about building software that solves problems. We count on our site reliability engineers (SREs) to empower users with a rich feature set, high availability, and stellar performance so they can pursue their missions. As we expand customer deployments, we’re seeking an experienced SRE to deliver insights from massive-scale data in real time. We are looking for a candidate who is ready for the fast lane, is eager to learn and contribute, brings fresh ideas and a unique viewpoint, cares about positive user experiences, and is willing to work and grow in one of the bank's premium IT teams. This role involves leading SRE initiatives and implementations in large-scale on-prem and cloud environments.

Responsibilities
  1. Build software and systems to manage platform infrastructure and applications.
  2. Improve reliability, quality, and time-to-market of our software solutions.
  3. Measure and optimize system performance, innovate for continual improvement, and anticipate customer needs.
  4. Provide operational support and engineering for large-scale distributed applications.
  5. Monitor system health, availability, and performance.
  6. Develop operational support for full-stack applications.
  7. Analyze metrics from OS and applications for performance tuning and fault detection.
  8. Participate in system design, platform management, and capacity planning.
  9. Create sustainable, automated systems and services.
  10. Balance development speed and reliability with service-level objectives.
  11. Collaborate with operations teams to troubleshoot infrastructure.
  12. Enhance system resilience and serve larger customer volumes with expert coding and change management skills.
  13. Improve automation and self-healing capabilities of systems.
  14. Report performance metrics to stakeholders.
  15. Act as a subject matter expert for stakeholders.
  16. Analyze data to provide insights and strategic recommendations.
  17. Implement changes based on industry trends.
  18. Engage with various areas across the bank for collaboration.
  19. Provide strategic input into business decisions as a trusted advisor.
  20. Understand organizational interactions and complexities.
  21. Stay updated on industry trends through professional development.
  22. Operate enterprise-wide as a resource to senior leaders.

Required Skills and Qualifications
  • Master’s degree in computer science or related field with 7-10 years of experience.
  • Proficiency in OS scripting (UNIX, Bash).
  • Experience with CI/CD and automation tools such as Ansible and JIRA.
  • Experience with source code management (GitHub).
  • Knowledge of distributed storage (NFS, HDFS, Ceph, Amazon S3) and resource management frameworks (Kubernetes, Mesos, YARN).
  • Experience with cloud platforms (AWS, Azure).
  • Experience with REST APIs.
  • Proactive problem-solving skills.
  • Understanding of IT operational processes, monitoring, logging, and alerting standards.
  • Knowledge of support and operational practices.
  • Strong analytical, problem-solving, and influence skills.
  • Excellent collaboration and team skills, with the ability to manage ambiguity.

Preferred Skills and Qualifications
  • Previous success in site reliability engineering roles.
  • Advanced coding experience beyond scripting.
  • Additional responsibilities may be assigned as needed.

Additional Information

Application Deadline: 05/22/2025

Address: 4100 Gordon Baker Road

Salary Range: $60,000.00 - $111,700.00

Pay Type: Salaried

Salaries vary based on location, skills, experience, and education. Benefits include health insurance, tuition reimbursement, and retirement plans. For more details, visit Total Rewards.

About Us

At BMO, we are driven by a shared purpose to create lasting, positive change. We value diversity and inclusion, offering accommodations upon request. Learn more at our careers page.
