Cloud Operations Engineer (Platform Reliability / NOC)

Agensi Pekerjaan Genie Hunt Talent

Petaling Jaya

On-site

MYR 60,000 - 80,000

Full time

27 days ago

Job summary

A recruitment agency is seeking a Cloud Operations Engineer to ensure the uptime and performance of client-facing applications. The role involves monitoring and maintaining cloud infrastructure, scripting routine tasks, and collaborating with developers, DevOps engineers and data teams. Candidates should have solid Linux skills and be familiar with Kubernetes, Docker or similar container platforms. Good communication skills are expected; Mandarin ability is a plus.

Qualifications

  • Solid Linux system administration skills (Ubuntu, CentOS or similar).
  • Experience troubleshooting application and infrastructure issues.
  • Comfortable working in a rotational shift environment.

Responsibilities

  • Monitor and maintain the performance and availability of applications.
  • Deploy and configure new services on Kubernetes or cloud instances.
  • Troubleshoot issues across servers, databases, and networks.

Skills

Linux system administration
Troubleshooting application issues
Kubernetes
Docker
Scripting/automation in Bash
Database performance tuning
Good communication

Tools

Kubernetes
Docker
MySQL
PostgreSQL

Job description

We are seeking an experienced Cloud Operations Engineer. You'll play a key role in keeping our client-facing applications, APIs and cloud infrastructure running smoothly, ensuring uptime, performance and reliability across multiple environments.

This role suits someone who loves problem-solving, enjoys working with Linux and modern cloud tools, and wants to grow in DevOps / Site Reliability Engineering.

Key Responsibilities:

  • Monitor & Maintain the performance and availability of our cloud-hosted applications and infrastructure.
  • Deploy & Configure new services (on Kubernetes, virtual machines or cloud instances) following best practices.
  • Troubleshoot issues across servers, databases, networks and deployment pipelines; identify root causes and resolve them quickly.
  • Automate routine checks and maintenance tasks using Bash or scripting tools.
  • Collaborate with developers, DevOps engineers and data teams to ensure smooth releases and stable environments.
  • Continuously improve our monitoring systems, alerting processes and incident response playbooks.
  • Participate in on-call rotation and respond to incidents according to defined SLAs.

Qualifications:

  • Solid Linux system administration skills (Ubuntu, CentOS or similar).
  • Experience troubleshooting application and infrastructure issues (network connectivity, database performance, deployments).
  • Familiar with Kubernetes, Docker or other container platforms.
  • Understanding of databases (MySQL, PostgreSQL, etc.) and performance tuning.
  • Ability to script/automate in Bash, Shell, or Python.
  • Comfortable working in a rotational shift environment.
  • Good communication skills; Mandarin ability is a plus.