This role operates within the vision, mission, and strategic plan of the Insurance Authority. Reporting to the Manager of Operations & Monitoring, the Senior Specialist safeguards the Authority’s IT infrastructure by orchestrating round-the-clock monitoring, first-line incident containment, and data-driven performance oversight. The Senior Specialist translates real-time telemetry into actionable insights, ensuring robust service continuity, regulatory compliance, and adherence to ISO 20000/27001 controls. By coordinating resolver teams, external vendors, and internal stakeholders, the position drives swift escalation, root-cause remediation, and transparent KPI reporting that informs divisional strategy and audit readiness. The position also champions continuous optimization, automates repetitive tasks, and mentors junior technicians to embed a culture of operational excellence and resilient service delivery.
Responsibilities and Tasks:
- Continuously monitor infrastructure dashboards, event logs, and automated alerts to detect anomalies across compute, storage, network, database, and application layers. Execute scheduled health checks and capacity-threshold reviews, escalating deviations in line with Operations & Monitoring Standard Operating Procedures.
- Maintain and fine‑tune alert rules, polling intervals, and thresholds to maximize coverage while reducing false positives.
- Coordinate first‑line incident response: perform technical triage, open ITSM tickets, and escalate to L2/L3 resolver groups within agreed service levels.
- Lead post‑incident reviews by collecting evidence, contributing to root‑cause analysis, and recommending permanent fixes for recurring issues.
- Track open incidents and problems through closure, ensuring SLA adherence and providing timely status updates to the Manager of Operations & Monitoring.
- Compile daily, weekly, and monthly operations dashboards, KPI scorecards, and trend analysis for divisional leadership and audit purposes.
- Create and maintain runbooks, knowledge‑base articles, and step‑by‑step procedures to support consistent execution and rapid knowledge transfer.
- Identify recurring alerts, performance bottlenecks, or manual tasks and propose optimization or automation opportunities to enhance service resilience.
- Support the deployment, configuration, and acceptance testing of new monitoring tools, agents, and scripts in accordance with change‑management policy.
- Ensure operational activities comply with internal ITSM governance, cybersecurity controls, and external standards such as ISO 20000 and ISO 27001.
- Liaise with application owners, infrastructure vendors, and the Service Desk to coordinate maintenance windows, patching schedules, and change activities with minimal business disruption.
- Coach and mentor junior technicians by sharing best practices, reviewing ticket quality, and fostering a culture of continual service improvement.
- Perform other job duties as assigned.
Job requirements:
Educational Qualifications (required)
- Bachelor’s degree in Computer Science, Software Engineering, Computer Engineering, Information Technology, Information Systems, Data Science, or a related field
Certifications (required)
- A relevant professional certificate is preferred
Experience
- 2+ years of position-relevant experience with a bachelor’s degree is required.
Language
(A1-A2: Basic, B1-B2: Intermediate, C1-C2: Fluent)
- English (C1), Arabic (C2)
Core competencies
- Ethics & Integrity (Beginner)
- Effective Communication (Beginner)
- Collaboration & Horizontality (Beginner)
- Personal Competence (Beginner)
- Analysis and Problem Solving (Intermediate)
- Network Infrastructure Management (Beginner)
- Backup & Disaster Recovery (Beginner)
- End User Device & Asset Management (Beginner)
- Service Desk & Incident Management (Intermediate)