Key Responsibilities
1. Incident Management & Troubleshooting
- Provide onsite technical support for all hardware, system, and application issues under maintenance scope.
- Assist in verifying, isolating, and resolving incidents, or in providing approved workarounds.
- Troubleshoot and resolve Level 1 and Level 2 technical issues.
- Escalate complex or unresolved incidents to the Maintenance Team Leader and inform the Maintenance Manager accordingly.
- Respond promptly to service disruptions, system alarms, or performance anomalies.
2. Preventive & Corrective Maintenance
- Perform daily system health checks and review logs to identify early signs of issues.
- Execute preventive maintenance activities and carry out corrective actions when required.
- Perform and verify scheduled backups: daily incremental backups, hot backups, and weekly full backups.
- Execute system recovery procedures during service restoration.
3. System Administration
- Add, remove, or update user account information in coordination with system owners.
- Reset passwords and manage access controls securely.
- Monitor system performance and tune systems or databases based on advisories or log findings.
4. Patch Management & Upgrades
- Test and deploy OS patches, firmware upgrades, and software updates.
- Perform staging and implementation of hardware upgrades and COTS software patches.
- Ensure all changes adhere to change control and system hardening policies.
5. Hardware & Infrastructure Maintenance
- Support, troubleshoot, and maintain hardware devices, including:
  - Servers (e.g., Dell PowerEdge R750)
  - Firewalls (e.g., FortiGate 1101E)
  - Storage Devices (e.g., Dell EMC XT380/XT480)
  - Switches (e.g., Cisco C9300)
  - UPS & Power Management (e.g., APC Smart UPS, Rack PDU)
  - KVM consoles, HSMs, NTP servers, and mobile computing devices
6. Software Platform Support
- Manage and monitor platforms and applications such as:
  - ArcGIS Server, IBM ACE + MQ, Kafka, MongoDB, MS SQL, WebSphere, Elastic Stack, Rocket.Chat
  - Security & Endpoint Tools: Symantec, Carbon Black EDR, CipherTrust, Fortify WebInspect, Keycloak
  - Monitoring & DevOps Tools: Grafana, Prometheus, GitLab Enterprise, Ansible, OpenShift, Red Hat Satellite
7. Documentation & Reporting
- Maintain and update all relevant documentation, including SOPs, maintenance records, system diagrams, and logs.
- Generate reports on system performance, incident handling, and preventive maintenance activities.
8. Advisory & Continuous Improvement
- Provide technical advice on infrastructure improvement, system performance tuning, and reliability enhancements.
- Propose and implement automation where applicable to streamline system monitoring and recovery processes.
Requirements
Essential Qualifications & Experience
- Diploma or Degree in Computer Science, Information Systems, or equivalent.
- Minimum 3 years of experience in IT system administration or infrastructure maintenance roles.
- Strong knowledge of Linux (RHEL) and Windows Server (2019) environments.
- Experience with backup systems (e.g., Dell EMC Data Domain, Avamar), firewalls, and enterprise-grade hardware.
- Familiarity with container platforms (e.g., OpenShift), middleware (IBM ACE, WebSphere), databases (SQL, MongoDB), and cloud or hybrid integration platforms.
- Working knowledge of DevOps tools (Ansible, GitLab, SonarQube).
- Familiarity with security technologies (Keycloak, CipherTrust, FortiGate, WebInspect).
- Knowledge of ITIL processes for incident, change, and problem management.