Herotel is looking for a Senior Infrastructure Engineer to join our team and take ownership of a hybrid infrastructure estate spanning cloud (primarily Azure) and on-premises systems. The ideal candidate will play a critical role in designing, maintaining, and evolving our infrastructure platforms to support business growth, high availability, and scalability.
This role is ideal for someone who thrives in a dynamic environment, has a strong background in both cloud and on-prem technologies, and can work autonomously on complex problems.
Location: Stellenbosch
Reporting to: Head of Infrastructure
Key Performance Areas include, but are not limited to:
- Design, implement, and maintain robust hybrid infrastructure solutions across cloud (Azure) and on-prem environments.
- Manage and optimize Kubernetes clusters in Azure Kubernetes Service (AKS).
- Maintain and administer Linux-based systems, ensuring performance, stability, and security.
- Build, manage, and deploy containerized applications using Docker and Kubernetes.
- Manage and support on-premises virtualization infrastructure, particularly Proxmox.
- Ensure high availability, disaster recovery, and scalability across services.
- Monitor infrastructure health and respond to incidents, performing root cause analysis and driving continuous improvement.
- Contribute to security hardening and compliance efforts.
- Administer and troubleshoot cPanel-based hosting environments, including site migrations, SSL, and performance tuning.
- Manage Domain Hosting and DNS configurations across various registrars and platforms.
Key Outputs:
- Highly Available Infrastructure: Ensure uptime, resilience, and redundancy across all cloud and on-prem systems, while maintaining service-level objectives (SLOs).
- Optimized Kubernetes Environments: Reliable and secure operation of Azure Kubernetes Service (AKS) clusters with efficient scaling, monitoring, and deployment processes in place.
- Stable and Secure Linux Systems: Consistently well-maintained and hardened Linux environments, supporting internal systems and hosted services.
- Efficient Proxmox Operations: On-prem virtualization infrastructure is fully operational, secure, and meets performance expectations.
- Containerization: Delivery of containerized applications using Docker.
- Robust Hosting Platform: Fully managed and monitored cPanel/domain hosting infrastructure with timely resolution of hosting issues and domain management.
- Documentation & Knowledge Sharing: Clear and up-to-date documentation for infrastructure, deployments, and procedures.
- Incident Response & Root Cause Analysis: Prompt response to infrastructure issues with detailed post-mortems and actionable improvements.
- Cross-Team Collaboration: Effective collaboration with development, DevOps, and security teams to ensure infrastructure supports business and product goals.
The successful candidate must have the following experience/skills:
Work Experience and Technical Skills:
- 5+ years of hands-on experience in infrastructure engineering or systems administration, with at least 2 years in a senior role.
- Proven track record of managing complex hybrid (cloud + on-prem) infrastructure environments in a senior infrastructure role.
- Strong Linux system administration skills (Ubuntu, CentOS, etc.).
- Expertise with Azure Kubernetes Service (AKS) and container orchestration.
- Strong hands-on experience with Docker – building, managing, and troubleshooting containers in production.
- Hands-on experience with Proxmox in production environments.
- Solid experience with cPanel, WHM, and related web hosting tools.
- Understanding of domain hosting, DNS management, and common registrar platforms.
- Solid understanding of networking, firewalls, DNS, load balancers, and VPNs.
- Experience with scripting (Bash, Python, or PowerShell).
- Familiarity with monitoring and alerting tools (Prometheus, Grafana, Graylog, Zabbix, etc.).
Competencies:
- Demonstrated ability to lead or mentor team members and drive infrastructure initiatives independently.
- Excellent written and verbal communication skills.
- Strong troubleshooting, problem-solving, and analytical skills.
- Strong collaboration abilities.
Qualifications:
- Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related field — or equivalent practical experience.
- Relevant certifications are a strong advantage, such as:
  - Microsoft Certified: Azure Administrator Associate / Azure Solutions Architect
  - Certified Kubernetes Administrator (CKA)
  - Linux Professional Institute Certification (LPIC) or Red Hat Certified Engineer (RHCE)
  - Docker Certified Associate
  - CompTIA Linux+
If you meet the above requirements, please submit your CV with contactable references.
PLEASE NOTE:
- Preference will be given to Previously Disadvantaged Individual candidates, in line with Herotel’s Employment Equity Plan.
- Submission of your CV provides Herotel with your express consent for us to process your personal information contained therein, for purposes of processing your application.
- Please refer to our Privacy Policy on our website for further information on how we process personal information.
- If you do not hear from us within 3 weeks of your application, please consider your application unsuccessful.