MEP/Construction/Infrastructure Engineer

Persol Apac

Tangerang

On-site

IDR 200.000.000 - 300.000.000

Full time

Today

Job summary

A leading technology company in Indonesia is seeking a Senior Site Reliability Engineer to design and maintain cloud infrastructure across AWS and Azure. The ideal candidate has 5+ years of experience in Site Reliability or DevOps, a strong background in Kubernetes, and a deep understanding of observability tools. In this role, you'll ensure system stability, automate processes, and collaborate closely with development teams. Competitive salary offered.

Qualifications

5+ years of experience in Site Reliability, DevOps, or Cloud Engineering roles.
Strong hands-on experience with AWS and familiarity with Azure services.
Proven experience deploying and managing containerized applications using Kubernetes and Docker.
Solid understanding of observability tools such as Prometheus and Grafana.
Proficiency in infrastructure-as-code tools like Terraform.
Excellent problem-solving skills with a focus on performance, scalability, and reliability.

Responsibilities

Design, build, and maintain scalable, reliable infrastructure across cloud environments.
Develop and manage CI/CD pipelines for automated deployments.
Operate and monitor Kubernetes clusters for system stability.
Implement observability solutions using Prometheus and Grafana.
Automate infrastructure provisioning and configuration.

Skills

AWS (EC2, VPC, IAM, CloudWatch, Elastic Beanstalk, RDS, S3)
Kubernetes (EKS/AKS)
CI/CD pipeline development
Observability tools (Prometheus, Grafana, Loki, Alertmanager)
Terraform
Scripting (Bash, Python, PowerShell)
Docker
Linux systems
Cloud security best practices
Networking

Tools

Azure DevOps
GitHub Actions
Ansible
Helm

Senior Site Reliability Engineer

Menteng, Jakarta
Paid: IDR 20.000.000 - 25.000.000

Job Description

Design, build, and maintain scalable, reliable, and secure infrastructure across AWS, Azure, and other cloud environments.
Develop and manage CI/CD pipelines using Azure DevOps, GitHub Actions, or similar tools for automated deployments.
Operate, monitor, and troubleshoot Kubernetes clusters (EKS, AKS, or self‑managed clusters) to ensure system stability and uptime.
Implement comprehensive observability solutions using Prometheus, Grafana, Loki, and Alertmanager.
Automate infrastructure provisioning and configuration using Terraform, Helm, CloudFormation, and/or Ansible.
Define, measure, and improve system reliability through SLOs, SLIs, and SLAs.
Enhance system resilience and incident response through proactive monitoring, capacity planning, and root‑cause analysis.
Manage secrets, access control, and security policies to maintain a robust and compliant infrastructure.
Participate in on‑call rotations, respond to incidents, and drive post‑incident reviews.
Collaborate closely with development teams to embed reliability and scalability best practices throughout the software lifecycle.
Demonstrate excellent change management, incident and problem response, and service request handling within defined SLAs.
Automate processes to reduce manual effort and improve operational efficiency.

Qualifications

5+ years of experience in Site Reliability, DevOps, or Cloud Engineering roles.
Strong hands‑on experience with AWS (EC2, VPC, IAM, CloudWatch, Elastic Beanstalk, RDS, S3) and familiarity with Azure services.
Proven experience deploying and managing containerized applications using Kubernetes (EKS/AKS) and Docker.
Skilled in CI/CD pipeline development and multi‑cloud workflows (Azure DevOps, GitHub Actions, etc.).
Solid understanding of observability tools such as Prometheus, Grafana, Loki, and Alertmanager.
Proficiency in infrastructure‑as‑code tools like Terraform, CloudFormation, or similar.
Scripting skills in Bash, Python, or PowerShell.
Strong grasp of networking, Linux systems, and cloud security best practices.
Excellent problem‑solving skills with a focus on performance, scalability, and reliability.
Experience with ITIL v4, Agile methodology, and digital financial products (e.g., virtual accounts, QR payments) is a plus.

Technical Skills

Docker, Kubernetes, Terraform, Ansible, CloudFormation, Helm
Prometheus, Grafana, Loki, Alertmanager
CI/CD pipelines, Git, GitHub Actions, Azure DevOps
Bash, Python, PowerShell
AWS, Azure, networking, Linux, cloud security

Non‑Technical Skills

Strong communication and interpersonal skills
Excellent relationship & stakeholder management
Strong motivational and empowerment skills
Commitment, reliability, and organizational leadership
Calm under pressure and able to take initiative
