MEP/Construction/Infrastructure Engineer

Persol Apac

Tangerang

On-site

IDR 200.000.000 - 300.000.000

Full time

Today

Job summary

A leading technology company in Indonesia is seeking a Senior Site Reliability Engineer to design and maintain cloud infrastructure across AWS and Azure. The ideal candidate has 5+ years of experience in Site Reliability or DevOps, a strong background in Kubernetes, and a deep understanding of observability tools. In this role, you'll ensure system stability, automate processes, and collaborate closely with development teams. Competitive salary offered.

Qualifications

  • 5+ years of experience in Site Reliability, DevOps, or Cloud Engineering roles.
  • Strong hands-on experience with AWS and familiarity with Azure services.
  • Proven experience deploying and managing containerized applications using Kubernetes and Docker.
  • Solid understanding of observability tools such as Prometheus and Grafana.
  • Proficiency in infrastructure-as-code tools like Terraform.
  • Excellent problem-solving skills with a focus on performance, scalability, and reliability.

Responsibilities

  • Design, build, and maintain scalable, reliable infrastructure across cloud environments.
  • Develop and manage CI/CD pipelines for automated deployments.
  • Operate and monitor Kubernetes clusters for system stability.
  • Implement observability solutions using Prometheus and Grafana.
  • Automate infrastructure provisioning and configuration.

Skills

AWS (EC2, VPC, IAM, CloudWatch, Elastic Beanstalk, RDS, S3)
Kubernetes (EKS/AKS)
CI/CD pipeline development
Observability tools (Prometheus, Grafana, Loki, Alertmanager)
Terraform
Scripting (Bash, Python, PowerShell)
Docker
Linux systems
Cloud security best practices
Networking

Tools

Azure DevOps
GitHub Actions
Ansible
Helm

Job description

Senior Site Reliability Engineer

Menteng, Jakarta
Paid: IDR 20.000.000 - 25.000.000

Job Description
  • Design, build, and maintain scalable, reliable, and secure infrastructure across AWS, Azure, and other cloud environments.
  • Develop and manage CI/CD pipelines using Azure DevOps, GitHub Actions, or similar tools for automated deployments.
  • Operate, monitor, and troubleshoot Kubernetes clusters (EKS, AKS, or self‑managed clusters) to ensure system stability and uptime.
  • Implement comprehensive observability solutions using Prometheus, Grafana, Loki, and Alertmanager.
  • Automate infrastructure provisioning and configuration using Terraform, Helm, CloudFormation, and/or Ansible.
  • Define, measure, and improve system reliability through SLOs, SLIs, and SLAs.
  • Enhance system resilience and incident response through proactive monitoring, capacity planning, and root‑cause analysis.
  • Manage secrets, access control, and security policies to maintain a robust and compliant infrastructure.
  • Participate in on‑call rotations, respond to incidents, and drive post‑incident reviews.
  • Collaborate closely with development teams to embed reliability and scalability best practices throughout the software lifecycle.
  • Demonstrate excellent change management, incident and problem response, and service request handling within defined SLAs.
  • Automate processes to reduce manual effort and improve operational efficiency.
Qualifications
  • 5+ years of experience in Site Reliability, DevOps, or Cloud Engineering roles.
  • Strong hands‑on experience with AWS (EC2, VPC, IAM, CloudWatch, Elastic Beanstalk, RDS, S3) and familiarity with Azure services.
  • Proven experience deploying and managing containerized applications using Kubernetes (EKS/AKS) and Docker.
  • Skilled in CI/CD pipeline development and multi‑cloud workflows (Azure DevOps, GitHub Actions, etc.).
  • Solid understanding of observability tools such as Prometheus, Grafana, Loki, and Alertmanager.
  • Proficiency in infrastructure‑as‑code tools like Terraform, CloudFormation, or similar.
  • Scripting skills in Bash, Python, or PowerShell.
  • Strong grasp of networking, Linux systems, and cloud security best practices.
  • Excellent problem‑solving skills with a focus on performance, scalability, and reliability.
  • Experience with ITIL v4, Agile methodology, and digital financial products (e.g., virtual accounts, QR payments) is a plus.
Technical Skills
  • Docker, Kubernetes, Terraform, Ansible, CloudFormation, Helm
  • Prometheus, Grafana, Loki, Alertmanager
  • CI/CD pipelines, Git, GitHub Actions, Azure DevOps
  • Bash, Python, PowerShell
  • AWS, Azure, networking, Linux, cloud security
Non‑Technical Skills
  • Strong communication and interpersonal skills
  • Excellent relationship & stakeholder management
  • Strong motivational and empowerment skills
  • Commitment, reliability, and organizational leadership
  • Calm under pressure and able to take initiative