
Site Reliability Engineer

Entrust Datacard

Toronto

Hybrid

CAD 70,000 - 110,000

Full time

30+ days ago


Job summary

Join a forward-thinking company as a Site Reliability Engineer, where you'll ensure the reliability and performance of a cutting-edge SaaS platform. This exciting role involves managing cloud environments, deploying automation strategies, and collaborating with development teams to enhance system security and efficiency. You'll have the opportunity to work in a hybrid environment, balancing in-office collaboration with remote flexibility. If you're passionate about technology and eager to make a significant impact in identity-centric security solutions, this role is perfect for you!

Benefits

Flexible working hours

Collaborative environment

Diversity and inclusion initiatives

Career growth opportunities

Qualifications

5+ years in a related role with extensive experience in microservices.
Hands-on experience with DevOps tools and cloud solutions.

Responsibilities

Monitor system performance using various metrics and tools.
Collaborate with teams to identify and mitigate risks.

Skills

DevOps

Site Reliability Engineering

Cloud Computing

Microservices

Troubleshooting

Incident Management

Automation

Root Cause Analysis

Education

Bachelor’s Degree in Computer Science

Equivalent experience

Tools

Ansible

Terraform

Jenkins

Octopus Deploy

Splunk

Prometheus

Grafana

Datadog

Azure

AWS

Career Growth, Flexibility and Collaboration!

Entrust is an innovative leader in identity-centric security solutions, providing an integrated platform of scalable, AI-enabled security offerings. Headquartered in Minnesota, we offer our colleagues the ability to work globally, in a flexible and collaborative environment. Our team makes an impact!

The Company: Entrust relies on curious, dedicated and innovative individuals who anticipate the future and provide solutions for a more connected, mobile and secure world. Entrust’s technologies and expertise help government agencies, enterprises and financial institutions in more than 150 countries serve and safeguard citizens, employees and consumers.

We Believe: Securing identities is most effective when we value all identities. We are committed to ensuring that, through diversity and inclusion, the many voices that make up our communities are heard. From unconscious bias training for managers to global affinity groups that create connections both within and across our enterprise, Entrust expects and encourages all individuals to accept and respect one another. And, of course, to be themselves.

Position Overview: The Instant Financial Issuance (IFI) Cloud Service includes a wide array of components, including web services, application servers, and databases, hosted in a hybrid cloud environment. The Site Reliability Engineer (SRE) will be responsible for ensuring that the SaaS platform is reliable, available, and performant, as well as scalable, secure, and cost-effective. Ultimately, the individual will be responsible for the functional management of all the IFIaaS cloud environments, applications, networks, scoping projects, and the resolution of application and network issues.

Responsibilities:

Monitor system issues using various metrics, such as uptime, latency, error rate, throughput, and availability
Deploy and maintain monitoring and on-call tools, e.g., Splunk, Prometheus, Grafana, PagerDuty, Datadog, etc.
Create strategies to detect issues, such as setting up alerts, dashboards, and health checks
Address issues as they arise, using troubleshooting techniques, root cause analysis, and incident management.
Design systems to troubleshoot automatically, using self-healing mechanisms such as auto-scaling, load balancing, failover, and mitigation runbooks
Collaborate with development teams and other stakeholders to identify potential risks, such as security vulnerabilities, performance bottlenecks, deployment issues, or configuration errors
Implement various risk mitigation strategies, such as patching, backup, redundancy, encryption, or testing
Design, build and maintain robust infrastructure built on Azure and AWS, leveraging native cloud technologies such as AKS, EKS, managed SQL, Mongo, etc.
Define and follow a clear incident response process, which includes roles, responsibilities, escalation, communication, and resolution
Use automation and orchestration tools to speed up the recovery process, such as restoring backups, rolling back changes, or deploying fixes
Design, implement and maintain robust CI/CD pipelines to automate the software delivery process
Automate configuration management tasks across multiple servers in hybrid cloud environments using tools like Ansible, Terraform, etc.
Define IaC to provision and manage cloud resources in hybrid environments (Azure, AWS, on-prem), including complete lifecycle management, scaling, and decommissioning.
Implement best practices and standards to prevent or reduce the occurrence of emergencies, such as code reviews, testing, and monitoring.
Implement and support a hybrid cloud environment in Microsoft Azure and on-premises
Update incident response runbooks and automation, and create new templates as required
Manage activities with complete integrity and in accordance with the organization's policies, systems, practices, and programs
Collaborate with product teams and other teams to understand user needs, expectations, and satisfaction.
Learn from incidents and post-mortems and implement the action items to prevent recurrence or improve response.
Suggest and implement new solutions and technologies to enhance the system and the service, such as optimization, automation, or innovation.
Provide after-hours support for production issues on a rotational basis with other team members to ensure system availability 24/7/365.

Basic Qualifications:

Bachelor’s Degree in Computer Science, Software Engineering, or equivalent combination of education and experience
5+ years of related experience as a Software Engineer, DevOps Engineer, Site Reliability Engineer or a role in a similar capacity
Extensive experience working with enterprise-level microservices applications, including deployment and maintenance of the applications in distributed environments.
Demonstrated hands-on experience and expertise with DevOps tooling (Ansible, Terraform, Jenkins, Octopus Deploy, etc.), networks, and network security, along with high-level managerial skills
In-depth hands-on experience with on-prem and cloud compute, storage and networking solutions (VMware, NetApp, Azure, AWS, etc.)

Where you will be: This role is hybrid, requiring three days a week in-office at our offices in Ottawa, Canada or Denver, CO, as specified in the job description. At Entrust, we have a distributed workforce.

About Entrust:

Entrust keeps the world moving safely by enabling trusted identities, payments and data protection around the globe. Today more than ever, people demand seamless, secure experiences, whether they’re crossing borders, making a purchase, or accessing corporate networks. With our unmatched breadth of digital security and credential issuance solutions, it’s no wonder the world’s most entrusted organizations trust us.

For more information, visit www.entrust.com. Follow us on LinkedIn, Facebook, Instagram, and YouTube.

Entrust Corporation is an EOE/AA/Veteran/People with Disabilities employer.

Updated 9/14/2020

NO AGENCIES, NO RELOCATION


For US roles, or where applicable:

Entrust is an EEO/AA/Disabled/Veterans Employer

For Canadian roles, or where applicable:

Entrust values diversity and inclusion and we are committed to building a diverse workforce with wide perspectives and innovative ideas. We welcome applications from qualified individuals of all backgrounds, and we strive to provide an accessible experience for candidates of all abilities.

If you require an accommodation, contact accessibility@entrust.com.

Recruiter:

Grace Rusingiza Grace.Rusingiza@entrust.com


