Production and Reliability Management Expert

Compunnel, Inc.

Montreal

On-site

CAD 80,000 - 120,000

Full time

26 days ago

Job summary

A leading company is seeking a skilled Production & Reliability Management Expert to join their Cyber Data Risk & Resilience team within the Identity & Access Management domain. You will manage incident responses, drive automation initiatives, and work with advanced cloud technologies to enhance cybersecurity measures. This critical role offers an opportunity to shape the future of cybersecurity in a global financial setting.

Qualifications

  • 4-5+ years of industry experience in software development and production support.
  • Strong Java development experience and proficiency in Python.
  • Experience with web programming and REST/SOAP APIs.

Responsibilities

  • Manage critical production incidents and communicate with stakeholders.
  • Drive automation initiatives and develop operational tools.
  • Collaborate within Agile, Scrum, and SRE frameworks for operational excellence.

Skills

Java
Python
Shell scripting
SQL
Communication skills
Problem-solving

Education

Bachelor’s degree in Computer Science, Software Engineering, or a related technical field

Tools

Ansible
GitHub
CI/CD tools

Job description

05/29/2025

Contract

Active

Job Summary

The client is seeking a skilled Production & Reliability Management Expert to join its Cyber Data Risk & Resilience (CDRR) team within the Identity & Access Management (IAM) domain. In this role, you will be a key member of a global team responsible for safeguarding the firm through the reliability and operational excellence of IAM control platforms. You will manage incident response, support Agile development integration, and drive automation initiatives while working with the latest cloud and data technologies.

This is a unique opportunity to contribute to cybersecurity defense at a global financial leader through cutting-edge technologies and agile development principles.

Key Responsibilities

  • Manage critical production incidents and communicate effectively with key business and technology stakeholders
  • Embed production support principles in Agile/DevOps development cycles to ensure high standards for production readiness
  • Own issue resolution and incident management, including leading incident calls and coordinating cross-functional teams
  • Reduce support costs through automation, optimization, and development of operational tools
  • Analyze technical debt and operational inefficiencies to prioritize remediation and stability improvements
  • Identify, design, and implement automation solutions for business process improvements
  • Develop, test, and deploy automation code; monitor and troubleshoot automation workflows
  • Collaborate with stakeholders to understand requirements and deliver scalable and reliable solutions
  • Work within Agile, Scrum, DevOps, and Site Reliability Engineering (SRE) frameworks to ensure continuous delivery and operational excellence

Required Qualifications

  • Bachelor’s degree in Computer Science, Software Engineering, or a related technical field
  • 4–5+ years of industry experience in software development and production support
  • Strong Java development experience in building multi-threaded, scalable applications
  • Proficiency in Python and Shell scripting
  • Hands-on experience with web programming and developing REST/SOAP APIs
  • Strong SQL skills and familiarity with DB2, Sybase, or Snowflake
  • Experience with automated testing, SDLC pipelines, and automated deployment practices
  • Solid working knowledge of Unix/Linux environments and infrastructure components like load balancing
  • Familiarity with DevOps tools such as Ansible, GitHub, or other CI/CD and release management tools
  • Excellent problem-solving skills and ability to work independently in high-pressure environments
  • Strong interpersonal and communication skills to effectively interact across all organizational levels

Preferred Qualifications

  • Experience working in financial services or cybersecurity operations
  • Familiarity with IAM platforms and concepts such as user lifecycle, entitlements, and privileged access management
  • Understanding of cloud technologies, infrastructure-as-code, and enterprise monitoring systems
  • Certifications in Agile, DevOps, or SRE methodologies (e.g., SAFe, CKA, SRE Practitioner)

Certifications

  • Relevant technical certifications (e.g., Java, Python, DevOps, Cloud, or SRE) are a plus but not required.