Toronto
On-site
CAD 70,000 - 100,000
Full time
Job summary
A leading company in IT services is seeking a systems monitoring specialist to ensure system health and lead operational investigations. The role requires strong troubleshooting skills and proficiency in SQL, along with excellent communication abilities. You will work closely with various IT partners to implement best practices and improve operational processes.
Qualifications
- Excellent interpersonal and communication skills (verbal and written).
- Deadline-driven and results-oriented.
- Strong troubleshooting skills.
Responsibilities
- Perform system health checks, monitoring, and alerting.
- Lead investigations and analyses of application operational failures.
- Write SQL queries and generate reports.
Skills
Interpersonal skills
Communication skills
Troubleshooting skills
Understanding of programming languages
Knowledge of SQL
Tools
SRE tools (e.g., PagerDuty)
SQL
Job Description:
- Perform system health checks, monitoring, and alerting, and take appropriate action as required.
- Lead investigations and analyses of application operational failures, drive resolution, and follow up for root cause identification.
- Analyze system monitoring dashboards (Dynatrace), logging systems (ELK), and other tools to identify potential issues, conflicts, and risks.
- Ensure the production and maintenance of high-quality support documents and processes.
- Promote innovative and improved working methods and implement best practices.
- Participate in walkthrough reviews of project handovers, specifications, programs, implementation plans, and PIV test plans.
- Write SQL queries against databases and generate reports.
- Establish working relationships across IT partners, units, and platforms to influence key business partners.
Must-Have:
- Excellent interpersonal and communication skills (verbal and written).
- Deadline-driven and results-oriented; capable of handling multiple tasks while maintaining high quality standards.
- Strong troubleshooting skills.
- Understanding of programming languages, databases, platforms, code management tools, and/or web technologies.
- Experience with SRE tools such as PagerDuty.
- Knowledge of SQL.
Nice-to-Have:
- Experience with architecture and process workflow analysis and design.
- Familiarity with ITSM.