Site CME Engineer

PT Dian Graha Elektrika

Jawa Barat

On-site

IDR 120.000.000 - 360.000.000

Full time

Today

Job summary

A leading tech company in Indonesia is seeking a Senior Site Reliability Engineer to design and maintain cloud infrastructure on AWS and Azure. The ideal candidate will have over 5 years of experience in SRE or similar roles, with skills in Kubernetes and CI/CD. Join us to enhance system reliability and oversee critical cloud operations, while working in a collaborative environment focused on excellence.

Qualifications

  • At least 5 years of proven experience as an SRE, IT Support, Application Support, or System Engineer.
  • CKA (Certified Kubernetes Administrator) certification is a plus.

Responsibilities

  • Design, build, and maintain scalable infrastructure across AWS and Azure.
  • Develop and manage CI/CD pipelines for automated deployments.
  • Operate and troubleshoot Kubernetes clusters to ensure system uptime.
  • Implement observability solutions using Prometheus, Grafana, etc.

Skills

Scripting (Bash/Python)
Strong relationship management
Excellent communication and interpersonal skills
Proactive monitoring
CI/CD

Tools

Docker
Terraform
Kubernetes
Azure DevOps

Job description

Senior Site Reliability Engineer

Menteng, Jakarta – Salary IDR120,000,000 to IDR360,000,000 – PT ALTO Network

Posted today

Job Description
  • Demonstrate excellent change management in implementing changes safely and efficiently in the production environment.
  • Demonstrate excellent incident and problem response, resolving issues within SLA.
  • Demonstrate excellent handling of service requests from other parties within SLA.
  • Demonstrate excellent efficiency in automating tasks and reducing manual effort.
  • Demonstrate excellent implementation of a comprehensive monitoring system to detect issues early and react proactively.
  • Demonstrate strong curiosity in investigating root causes and reviewing root cause analyses.
  • Demonstrate excellence in reviewing system performance and producing an action plan.
  • Demonstrate excellent problem‑solving and produce an action plan.
  • Demonstrate excellence in reviewing change activity in production.
  • Take responsibility for incident and problem resolution.
  • Enable automation processes for each product.
  • Understand customer (internal & external) needs and deliver the expected outcomes.
  • Execute plans and strategies.
  • Foster a customer‑focused working environment with clear responsibilities and expectations.
  • Create and execute the deployment strategy.
  • Establish and maintain active and constructive relationships with other internal teams in the organization.
  • Address and close out audit risks and findings.
  • Support good corporate governance within your specific area of work.
Responsibilities
  • Design, build, and maintain scalable, reliable, and secure infrastructure across AWS (including Elastic Beanstalk) and Azure.
  • Develop and manage CI/CD pipelines using Azure DevOps, GitHub Actions, or similar tools to ensure smooth and automated deployments.
  • Operate, monitor, and troubleshoot Kubernetes clusters (EKS, AKS, or self‑managed) to ensure system stability and uptime.
  • Implement comprehensive observability solutions using Prometheus, Grafana, Loki, and Alertmanager.
  • Automate infrastructure provisioning and configuration using Terraform, Helm, CloudFormation, and/or Ansible.
  • Define, measure, and improve system reliability through SLOs, SLIs, and SLAs.
  • Enhance system resilience and incident response through proactive monitoring and capacity planning.
  • Manage secrets, access control, and security policies to maintain a robust and compliant infrastructure.
  • Participate in on‑call rotations, respond to incidents, and drive root cause analysis and post‑incident reviews.
  • Collaborate closely with development teams to embed reliability and scalability best practices throughout the software lifecycle.
Qualifications
  • At least 5 years of proven experience as an SRE, IT Support, Application Support, System Engineer, or in a similar position.
  • CKA (Certified Kubernetes Administrator) certification is a plus.
Knowledge
  • ISO 8583
  • REST API
  • Networking
  • Postman / API testing
  • ITIL v4 / IT Service Management
  • Agile Methodology
  • Digital financial products (biller, disbursement, virtual account, QR)
Non-Technical
  • Reporting and emergency response planning
  • Strong relationship management
  • Excellent communication and interpersonal skills
  • Strong motivational and empowerment skills
  • Committed and reliable
  • Outstanding organizational and leadership skills
  • Take initiative and remain calm under pressure
Technical
  • Docker
  • Windows
  • SQL queries
  • CI/CD
  • Scripting (Bash/Python)