Senior Site Reliability Engineer (SRE)

Devopshunt

Montreal

On-site

CAD 100,000 - 150,000

Full time

30+ days ago

Job summary

A leading technology company is seeking a Senior Site Reliability Engineer to strengthen its AWS infrastructure. The role focuses on building reliable platforms, automating operations, and ensuring security compliance within a cross-functional team.

Qualifications

  • Experience with hybrid cloud infrastructures and Infrastructure as Code.
  • Proficiency in managing Kubernetes and optimizing cloud services.
  • Familiarity with observability tools and security standards.

Responsibilities

  • Architect scalable hybrid cloud infrastructure using IaC tools.
  • Manage and optimize Kubernetes clusters for high availability.
  • Ensure compliance with security policies and standards.

Skills

Kubernetes management
Terraform
Observability
Automation
CI/CD pipelines
Security compliance

Job description

We are seeking a Senior Site Reliability Engineer (SRE) to join our team and play a key role in ensuring the reliability, scalability, and security of our hybrid AWS infrastructure. Reporting to the Digital Infrastructure Team Lead, you will collaborate with cross-functional teams to design and optimize cloud platforms, streamline developer workflows, and drive automation and observability practices. This is an opportunity to make a significant impact in a fast-paced, innovative environment.

If you’re passionate about building reliable, scalable, and secure platforms while empowering developers, apply today to join our dynamic team!

Key Responsibilities:

  • Architect scalable, fault-tolerant hybrid cloud infrastructure using Infrastructure as Code (IaC) tools (e.g., Terraform, OpenTofu).
  • Build reusable templates and optimize multi-region systems for disaster recovery (RTO

Platform Operations

  • Manage Kubernetes clusters (e.g., EKS, GKE) and ensure seamless hybrid cloud integration.
  • Perform updates, scaling, and maintenance to maintain high availability and SLA compliance.
  • Build and optimize CI/CD pipelines using tools like Jenkins, GitLab, and GitHub Actions.
  • Develop self-service tools that enable developers to provision environments and deploy code independently.

Observability & Security

  • Deploy modern observability solutions (e.g., Prometheus, Grafana, OpenTelemetry) to achieve 100% platform visibility.
  • Automate security policies in CI/CD pipelines and enforce compliance with standards like SOC 2 and ISO 27001.

Automation & Optimization

  • Automate repetitive tasks to improve operational efficiency and reduce manual effort.
  • Optimize cloud resource utilization to enhance performance while reducing costs.
  • Collaborate with developers, cloud engineers, and security teams to align infrastructure goals with business objectives.
  • Advocate for SRE best practices, fostering a culture of automation, observability, and scalability.
