A leading company in Singapore seeks a highly skilled IT Operations Specialist, focusing on Red Hat Enterprise Linux (RHEL) systems. This role involves managing critical incidents, optimizing system performance, and ensuring security compliance in a fast-paced environment. The ideal candidate will have over a decade of experience in RHEL, with a strong emphasis on cross-functional collaboration and a thorough understanding of cloud and hybrid infrastructure management.
Job Purpose:
· Provide technical support for the in-scope technology domains (Patch Management, Vulnerability Management and DCS Operations Support) for Red Hat Enterprise Linux.
· Work with DCS Operations service providers, product principal vendors, Branch IT support and external vendors.
Job Responsibilities:
1. Incident & Problem Resolution
- Act as the highest escalation point for complex Linux (RHEL) server incidents.
- Lead root cause analysis (RCA) for critical outages and implement preventive measures.
- Mentor L1/L2 teams in troubleshooting and best practices.
2. System Administration & Optimization
- Manage large-scale Linux environments (on-prem/cloud), including performance tuning, kernel patching, and security hardening.
- Design and implement high-availability (HA) and disaster recovery (DR) solutions (clustering, failover, backups).
- Automate repetitive tasks using Bash/Python/Ansible for configuration management.
3. Security & Compliance
- Enforce security baselines (CIS, STIG) and remediate vulnerabilities (CVEs).
- Collaborate with Cyber Security teams to implement audit logging, SELinux, firewalls (iptables/nftables), and encryption.
- Ensure compliance with security baselines for Linux infrastructure.
4. Change & Patch Management
- Plan and execute OS upgrades, patches, and migrations with minimal downtime.
- Review and approve change requests for Linux systems in compliance with ITIL.
- Maintain patch compliance using tools like Red Hat Satellite.
5. Cloud & Hybrid Infrastructure
- Deploy and manage Linux workloads on-premises and in Azure (Azure VMs).
- Troubleshoot hybrid-environment issues (e.g., authentication, DNS, latency).
6. Monitoring & Performance
- Configure and optimize monitoring tools (e.g., SolarWinds, PRTG).
- Analyse system metrics to support proactive capacity planning and resource allocation.
7. Collaboration & Documentation
- Work with other IT domains (e.g., Database, Network and Application teams) to resolve cross-functional issues.
- Document SOPs, RCA reports, and DR runbooks for Linux systems.
8. On-Call & Operational Support
- Participate in 24/7 on-call rotations for critical incident response.
- Lead DR drills and ensure Linux systems meet RTO/RPO targets.
Job Requirements:
1. 10+ years of hands-on experience in Red Hat Enterprise Linux (RHEL) operations, with at least 5 years in mission-critical environments (24/7 uptime, 500+ servers).
2. Willingness to participate in rotational on-call support (including nights/weekends) for critical incident response and maintenance activities.
3. Strong understanding of Service Level Agreements (SLAs), Recovery Time Objectives (RTO), and Recovery Point Objectives (RPO) in an enterprise IT environment.
4. Hands-on experience with infrastructure monitoring tools (e.g., SolarWinds, PRTG) for performance tracking and proactive issue resolution.
5. Proven ability to work in multicultural, geographically dispersed teams (APAC experience preferred).
6. Intermediate-level knowledge of ITSM processes (Incident, Change, Problem, and Service Request Management) and experience with ITSM tools (e.g., ServiceNow).
7. Must be able to work independently and in collaboration with various stakeholders as part of an APAC team, e.g., DCS-related planning and implementation teams, security teams, service management teams, application teams, country IT teams and others.
8. Fluent business English (written and spoken) for technical documentation, stakeholder communication, and incident reporting.
9. Deep expertise in RHEL 7/8/9 (installation, hardening, tuning, troubleshooting), including kernel patching and disaster recovery.
10. Proficiency in virtualization (VMware vSphere, Nutanix AHV) for resource optimization, clustering, and high-availability configurations.
11. Working knowledge of cloud platforms (Azure/AWS) for hybrid infrastructure management (IaaS).
12. Strong analytical and problem-solving skills, with the ability to diagnose and resolve complex technical issues under pressure.
13. Industry certifications (e.g., RHCE, RHCSA, or equivalent) are highly desirable.
14. Hands-on experience with SELinux, CIS benchmarks, vulnerability remediation, and audit logging (e.g., auditd).