The Platform Operations Engineer works with government agencies to maintain and support critical on‑premises platforms and infrastructure. The role focuses on platform reliability, modern operational practices, and infrastructure modernisation, while continuing to provide strong support for existing systems.
Key Responsibilities
- Maintain critical infrastructure platforms (compute, storage, virtualisation, and supporting systems) across development, staging, and production environments.
- Manage virtualisation platforms (VMware, Hyper‑V), including capacity monitoring, performance optimisation, and lifecycle management.
Operational Standards & Automation
- Follow and implement platform standards.
- Apply infrastructure automation and modern operational practices to improve efficiency and reliability.
- Execute platform patching strategies using automation to maintain security and stability while minimising service disruption.
Enhancement & Modernisation
- Support platform enhancement initiatives.
- Implement new infrastructure solutions aligned with enterprise architecture standards.
- Support containerisation initiatives and maintain container orchestration platforms.
Monitoring & Observability
- Implement and maintain monitoring and observability solutions using Prometheus, Grafana, and the ELK stack.
Incident Resolution
- Provide L2/L3 technical support for platform‑related incidents.
- Perform problem determination and resolution.
Infrastructure as Code (IaC)
- Implement IaC practices for automated platform provisioning and configuration management.
Backup, DR & High Availability
- Maintain backup, disaster recovery, and high‑availability solutions for critical platform components.
Security & Compliance
- Follow security controls covering access management, security hardening, and compliance monitoring.
Collaboration & Documentation
- Collaborate with application teams to ensure stability, performance, and scalability.
- Create and maintain platform documentation, runbooks, and SOPs.
- Support team members in modern infrastructure practices.
Technical Requirements
- Strong experience in enterprise virtualisation (VMware vSphere, Hyper‑V)
- Experience with SAN / NAS storage systems
- Experience with enterprise backup solutions
- Proficiency in Linux and Windows Server administration
- Experience with infrastructure automation tools (Ansible, Puppet, Chef)
- Knowledge of container technologies (Docker, Kubernetes)
- Familiarity with monitoring and observability platforms
- Experience with Infrastructure as Code (IaC) practices
- Understanding of networking concepts & technologies
- Scripting abilities (Python, PowerShell, Bash)
Desired Certifications
- VMware Certified Professional (VCP)
- Microsoft Certified: Windows Server
- Red Hat Certified Engineer (RHCE)
- ITIL 4 Foundation