Production and Reliability Management Expert
As a part of its agreements with its various clients, Procom is currently seeking a Production and Reliability Management Expert for a company in the investment sector. Our client is located in Montréal.
Job details – Production and Reliability Management Expert
Key responsibilities for this position include:
- Managing critical incidents and ensuring all key management and business stakeholders are kept up to date;
- Ensuring Production Management is closely aligned with and embedded in the Agile software development process, and that our code meets production standards;
- Excellent communication and interpersonal skills with a professional ownership of issues;
- Reduction of the cost of support (hours of effort) through the elimination of operational issues, optimization and automation of tasks, development of operational tools and driving client self-service to minimize constraints;
- Ability to manage an incident call and coordinate multiple teams towards a common goal of resolving a business impactful outage;
- Identification and prioritization of technical debt that risks instability or creates wasteful operational toil;
- Analyse business processes and identify automation opportunities;
- Develop, test and deploy automations;
- Integrate automation solutions with existing systems and infrastructure;
- Monitor and troubleshoot automation issues;
- Collaborate with stakeholders to understand requirements and deliver solutions;
- Comfortable with DevOps, Agile, Scrum and SRE principles.
Mandatory Skills – Production and Reliability Management Expert
- Bachelor's degree in Computer Science, Software Engineering or a related field;
- Minimum of four to five years of industry experience in software development;
- Strong development skills in Java, building medium- to large-scale multi-threaded applications;
- Knowledge of Python and shell scripting;
- Experience with Web Programming and REST / SOAP services (API);
- Database experience with SQL queries, DB2, Sybase or Snowflake databases, and DB reporting;
- Experience creating automated test suites, and with the SDLC and automated deployments;
- System knowledge of Unix / Linux and of infrastructure setup such as load balancing;
- Experience with Ansible, GitHub or similar configuration / release management tools.