Cloud Infrastructure Engineer

INFINITY LINKS PTE. LTD.

Singapore

On-site

SGD 70,000 - 95,000

Full time

Today

Job summary

A technology solutions provider in Singapore seeks a skilled Cloud Infrastructure Engineer to design and maintain scalable GPU infrastructure. The ideal candidate should have 3–7 years’ experience in DevOps and proficiency with tools like Kubernetes and Terraform. Responsibilities include automating provisioning, ensuring security, and optimizing performance in multi-tenant environments. Join a dynamic team dedicated to advancing AI applications.

Skills

DevOps

Site Reliability

Infrastructure Engineering

Linux Systems

Kubernetes

Scripting (Bash, Python, Go)

Networking

Tools

Terraform

Ansible

Docker

Prometheus

Grafana

ELK

GitLab CI

ArgoCD

Jenkins

Flux

Overview

IXL Cloud enables businesses, start-ups, researchers, and developers to train, deploy, and scale their AI systems with unmatched performance and flexibility.

We accelerate their AI journey by delivering leading GPU infrastructure, seamless scalability, and AI-first operational support—helping bring advanced AI applications to fruition without the complexity of managing underlying compute architecture.

Responsibilities

As a Cloud Infrastructure Engineer, you will:

Design, deploy, and maintain scalable cloud infrastructure for GPU workloads using tools like Terraform, Ansible, and Kubernetes.
Automate provisioning of compute resources across bare-metal and cloud environments.
Manage container runtime and orchestration platforms (Docker, Kubernetes) for multi-tenant GPU cluster environments.
Monitor infrastructure performance, uptime, and system health using observability tools (Prometheus, Grafana, ELK, etc.).
Maintain and optimize storage, networking, and load balancing layers for high-throughput AI workloads.
Implement CI/CD pipelines for both infrastructure and application-level changes.
Collaborate with software engineers, platform teams, and AI researchers to understand workload needs and optimize system performance accordingly.
Ensure infrastructure security, including secrets management, RBAC, and compliance with best practices.
Troubleshoot and resolve infrastructure incidents, scaling issues, and performance bottlenecks.
Support hardware provisioning, firmware updates, and GPU driver/CUDA installations.

Qualifications

3–7 years of experience in DevOps, Site Reliability, or Infrastructure Engineering roles.
Deep experience managing Linux systems in production environments.
Experience deploying and managing Kubernetes clusters at scale (bare metal or cloud-native).
Familiarity with NVIDIA GPU drivers and the CUDA toolkit, and with GPU workload optimization, is a plus.
Proficiency in scripting languages (Bash, Python, Go, etc.).
Strong understanding of networking, firewalls, and storage systems in distributed compute environments.
Experience with CI/CD tools such as GitLab CI, ArgoCD, Jenkins, or Flux.
Excellent communication and documentation skills.
