Site Reliability Engineer

HRB

Canada

Remote

CAD 100,000 - 140,000

Full time

2 days ago

Job summary

A leading entertainment company seeks a Senior Site Reliability Engineer to enhance the performance and reliability of its infrastructure. The role involves managing cloud technologies, employing Kubernetes orchestration, and collaborating with teams for continuous improvement while leveraging AI technologies. Experience with Terraform, Oracle EBS, and strong operational knowledge are essential for success in this position.

Qualifications

  • Expert-level knowledge of Terraform for infrastructure automation.
  • Hands-on experience managing Kubernetes clusters.
  • Advanced cloud knowledge covering AWS and Azure.

Responsibilities

  • Develop and maintain automated infrastructure provisioning.
  • Design and manage multi-cloud environments.
  • Support and optimize Oracle EBS deployments.

Skills

Terraform
Cloud Technologies
Kubernetes
Database Administration
Oracle E-Business Suite
Linux (RHEL)
Windows Server
Networking
Logging and Monitoring Tools
Incident Management

Job description

Senior Site Reliability Engineer
Position Overview

We are a mid-size entertainment company delivering captivating digital experiences to millions of customers worldwide. Our IT organization powers the infrastructure and systems behind our cutting-edge payroll and accounting applications. We are seeking a Senior Site Reliability Engineer (SRE) to enhance the performance, scalability, and reliability of our infrastructure and help bring our next-generation solutions to life.

As a Senior Site Reliability Engineer, you will ensure the reliability and scalability of our infrastructure. You will apply your skills in cloud technologies, infrastructure operations, Kubernetes orchestration, application development, database administration, and Oracle E-Business Suite (EBS) to maintain robust infrastructure that supports business-critical platforms. This role also involves collaborating with cross-functional teams to implement engineering best practices, monitoring, and automation while exploring opportunities to enhance operations with emerging AI technologies.

Key Responsibilities
  • Infrastructure as Code: Develop and maintain automated infrastructure provisioning with Terraform for hybrid cloud environments.
  • Cloud Expertise: Design and manage robust multi-cloud environments using AWS and Azure, with a focus on optimizing Kubernetes clusters (EKS and AKS).
  • Oracle E-Business Suite (EBS): Support, optimize, and ensure the reliability of Oracle EBS deployments, integrating them with other IT systems to maintain smooth business operations.
  • Operating Systems Management: Administer and optimize Linux (RHEL) and Windows Server environments to ensure high availability and security.
  • Application Performance: Collaborate with development teams to enhance applications built on React, Node.js, .NET, C#, and Java for reliability and performance.
  • Networking & Security: Leverage advanced AWS networking skills to implement secure and scalable architectures, including VPC design, load balancing, and advanced routing.
  • Database Optimization: Monitor and tune database performance and manage relational and NoSQL databases to support high-traffic entertainment services.
  • Monitoring & Troubleshooting: Implement observability tools and proactively address performance issues using platforms like Prometheus, Grafana, Splunk, or CloudWatch.
  • Incident Response & Automation: Lead incident management, postmortem reviews, and automation efforts to prevent recurrence and improve overall resilience.
  • Cross-Team Collaboration: Work closely with developers, system administrators, and security teams to align infrastructure needs with business and technical goals.
Qualifications
Required Technical Skills
  • Expert-level knowledge of Terraform for infrastructure automation.
  • Hands-on experience managing Azure Kubernetes Service (AKS) and Amazon Elastic Kubernetes Service (EKS) clusters.
  • Advanced knowledge of AWS and Azure cloud ecosystems, including networking, security, and cost optimization.
  • Proficiency in Linux (RHEL) and Windows Server environments.
  • Proven experience supporting and optimizing Oracle E-Business Suite (EBS) in a complex IT environment.
  • Proven application development experience with React, Node.js, .NET, C#, and Java.
  • Strong database administration and performance-tuning skills for both relational (e.g., MySQL, PostgreSQL, MSSQL) and NoSQL (e.g., DynamoDB, MongoDB) databases.
  • Advanced networking skills, including VPC design, transit gateways, and hybrid cloud connectivity.
  • Expertise in monitoring, logging, and troubleshooting tools like New Relic, Prometheus, Grafana, Splunk, CloudWatch, and others.
Desired Soft Skills
  • Strategic thinking to design scalable and reliable systems for high-demand entertainment platforms.
  • Strong collaboration and mentorship abilities to guide teams in adopting SRE best practices.
  • Excellent communication skills to work with technical and non-technical stakeholders.
  • Adaptability to a fast-paced, dynamic environment.
Nice-to-Have Skills
  • Experience with AI-powered Operations (AIOps) to automate troubleshooting and predictive maintenance.
  • Experience in high-traffic or live-streaming applications.
  • Certifications such as AWS Certified Solutions Architect or Azure Solutions Architect Expert.
  • Familiarity with industry-specific compliance standards, e.g., SOC 2, GDPR.