TECEZE - IAC Engineer - AI Datacenter Automation

TECEZE CONSULTANCY SERVICES PRIVATE LIMITED

Dubai

On-site

AED 120,000 - 200,000

Full time

15 days ago

Job summary

A leading consultancy in Dubai seeks a highly skilled Infrastructure as Code (IaC) Engineer to spearhead automated provisioning for high-performance AI data centers. The role involves designing scalable infrastructure using modern IaC tools, collaborating closely with AI/ML teams, and ensuring compliance with security standards. Ideal candidates will have extensive experience and certifications in Terraform, Ansible, and Kubernetes.

Qualifications

  • 5+ years of experience in infrastructure automation or SRE roles.
  • Proficiency in Terraform and Ansible, with experience in AI environments.
  • Strong understanding of networking and virtualization platforms.

Responsibilities

  • Design and implement IaC frameworks for AI workloads.
  • Automate deployment of Kubernetes clusters and build CI/CD pipelines.
  • Ensure compliance with security and operational standards.

Skills

Infrastructure Automation
Scripting
Networking
Virtualization
CI/CD

Certifications

HashiCorp Certified: Terraform Associate
Red Hat Certified Specialist in Ansible Automation
CKA (Certified Kubernetes Administrator)
Cloud certifications (AWS, Azure, GCP)

Tools

Terraform
Ansible
Kubernetes
GitLab CI/CD
Jenkins

Job description

We are seeking a highly capable Infrastructure as Code (IaC) Engineer to lead the design, implementation, and management of automated infrastructure provisioning for high-performance AI data centers.

This role is central to orchestrating compute, network, storage, and virtualization layers using modern IaC tools across on-premises and hybrid cloud environments.

The ideal candidate will play a strategic role in enabling scalable and repeatable deployment pipelines that support GPU clusters, AI model training environments, and containerized platforms such as Kubernetes.

Responsibilities

  • Design and implement IaC frameworks to automate the provisioning and configuration of data center infrastructure for AI workloads.
  • Orchestrate and manage multi-layer automation across compute (GPU/CPU), networking (VXLAN, EVPN, BGP), storage (NVMe, object, parallel file systems), and virtualization platforms (KVM, VMware, OpenShift).
  • Develop reusable Terraform modules, Ansible playbooks, and YAML templates to define infrastructure in version-controlled environments.
  • Automate deployment of Kubernetes clusters and integrate with GPU operators for training and inference pipelines.
  • Build and maintain CI/CD pipelines to deploy, test, and manage infrastructure changes using tools like GitLab CI/CD, Jenkins, or ArgoCD.
  • Integrate with monitoring and observability stacks (Prometheus, Grafana, DCGM) for automated infrastructure validation and health monitoring.
  • Work closely with AI/ML platform teams to align infrastructure deployment with model training, data pipelines, and security policies.
  • Ensure compliance with security and operational standards through policy-as-code and drift detection.

Skills & Experience

  • 5+ years of experience in infrastructure automation or SRE roles with hands-on IaC deployment.
  • Proficiency in Terraform, Ansible, and scripting languages such as Python and Bash, as well as YAML.
  • Experience automating infrastructure in GPU-intensive environments supporting AI/ML workloads.
  • Strong understanding of networking (VXLAN, EVPN, BGP, RoCE) and virtualization platforms (OpenShift, VMware, KVM).
  • Familiarity with Kubernetes, Helm, Operators, and container orchestration frameworks.
  • Exposure to storage automation for AI data lakes (e.g., Ceph, BeeGFS, Lustre, or S3-compatible storage).
  • Experience with CI/CD tools (GitLab CI/CD, Jenkins, ArgoCD, Flux) in IaC.

Certifications

  • HashiCorp Certified: Terraform Associate
  • Red Hat Certified Specialist in Ansible Automation
  • CKA (Certified Kubernetes Administrator) or equivalent
  • Cloud certifications (AWS, Azure, or GCP preferred for hybrid orchestration)