DevOps Engineer (Containerization/Kubernetes, AWS Cloud, AI/ML)

Mindteck

Penang

On-site

MYR 60,000 - 90,000

Full time

4 days ago

Job summary

A leading technology firm in Penang, Malaysia, is seeking an experienced professional to manage a global AI Training environment. The candidate will allocate compute resources dynamically, oversee AWS systems, and provide MLOps support through tools such as MLflow and KServe. Required qualifications include strong Unix/Linux and Windows Server administration skills, familiarity with containerization and cloud infrastructure, and at least 3 years of experience. The role may require attending night calls, depending on project timelines.

Qualifications

  • 3-7 years of experience in managing data centers and servers.
  • Familiarity with MLOps platforms and cloud environments.
  • Basic understanding of AI/ML architecture.

Responsibilities

  • Manage a global AI Training environment with dynamic resource allocation.
  • Provide technical support on servers and systems.
  • Administer AWS and on-premise systems with usage tracking.

Skills

Unix/Linux administration
Windows Server admin experience
Containerized environments experience
Scripting experience
Experience with VMs and cloud infrastructure

Tools

MLflow
KServe
AWS
Docker
Kubernetes

Job description

We are looking for candidates who will help us design and develop a global AI Training environment that dynamically allocates compute and GPU resources based on model training requirements.

Role

  • Manage the global AI Training environment (servers and data center) that dynamically allocates compute and GPU resources based on model training requirements.
  • Provide an MLOps platform for model tracking, cataloging, and deployment using tools such as MLflow and KServe.
  • Develop dashboards that display server and environment status.
  • Administer AWS cloud and on‑premises systems, including usage tracking and billing through a chargeback model.
  • Provide technical support for servers.

Mandatory skills

  • Unix/Linux and Windows Server administration experience in data center and server environments.
  • Familiarity with containerized environments, Kubernetes/Docker, and Rancher.
  • Experience with virtual machines (VMs), containerized systems, and cloud infrastructure basics (AWS).
  • Scripting experience in server administration roles.

Good to have

  • Python, SQL, Node.js, and web services design/development.
  • Messaging services such as RabbitMQ and Kafka.
  • Understanding of GenAI, AI/ML solution architecture, and deployment in manufacturing environments.

Level of experience required

3 – 7 years

Working hours

Normal working hours; night calls with the US team may be required for the global project.
