AI Infrastructure Engineer – GRaduate Industry Traineeships (GRIT) Programme

RAYDIAN CLOUD PTE. LTD.

Singapore

On-site

SGD 80,000 - 100,000

Full time

Today

Job summary

A technology company in Singapore is seeking an AI Infrastructure Engineer to design and manage the physical infrastructure for advanced machine learning workloads. The ideal candidate will have hands-on experience with servers and compute clusters, expertise in remote management tools, and strong troubleshooting skills. This role offers competitive compensation and the opportunity to work with cutting-edge hardware and technologies.

Benefits

Competitive compensation
Flexible work culture
Rapid growth opportunities

Qualifications

  • Hands-on experience with servers, high-performance storage, and compute clusters.
  • Experience with remote management tools (IPMI, Redfish, BMC) and monitoring platforms.
  • Excellent documentation and troubleshooting skills.

Responsibilities

  • Design and deploy physical infrastructure for AI workloads including GPU servers.
  • Manage data center operations including rack layout and power distribution.
  • Monitor hardware performance across distributed environments.
  • Coordinate with vendors for hardware procurement and support.
  • Document infrastructure designs and operational procedures.

Skills

Experience with servers
High-performance storage
Compute clusters
Remote management tools (IPMI, Redfish, BMC)
Excellent documentation skills
Troubleshooting skills

Job description

Overview

About the Role: Raydian Cloud is building the physical backbone of enterprise AI. As an AI Infrastructure Engineer focused on physical systems, you’ll be responsible for architecting, deploying, and maintaining the high-performance compute environments that power advanced machine learning workloads. From GPU clusters and high-speed networking to storage and cooling systems, you’ll ensure our infrastructure is optimized for scale, reliability, and performance.

Responsibilities
  • Design and deploy physical infrastructure for both traditional and AI workloads, including GPU servers, high-density compute nodes, and NVMe-based storage arrays
  • Manage data center operations, including rack layout, cabling, power distribution, and cooling optimization
  • Monitor hardware performance, utilization, and health across distributed environments
  • Coordinate with vendors and OEMs for hardware procurement, installation, and support
  • Document infrastructure designs, deployment procedures, and operational runbooks
Required Skills & Qualifications
  • Hands-on experience with servers, high-performance storage, and compute clusters
  • Experience with remote management tools (IPMI, Redfish, BMC) and monitoring platforms
  • Excellent documentation and troubleshooting skills
Nice to Have
  • Experience with liquid cooling or immersion cooling systems for high-density compute
  • Familiarity with edge AI deployments and hybrid infrastructure models
  • Certifications in data center operations, hardware engineering, or AI infrastructure
Why Join Raydian Cloud?
  • Help build the physical foundation of enterprise AI transformation
  • Work with cutting-edge hardware and infrastructure technologies
  • Influence infrastructure strategy and deployment standards
  • Competitive compensation, flexible work culture, and rapid growth opportunities