Cloud Resiliency Architect

Cognizant

Kuala Lumpur

On-site

MYR 120,000 - 160,000

Full time

14 days ago

Job summary

A leading technology services company is seeking a Cloud Resiliency Architect in Kuala Lumpur. You will support customers in enhancing their cloud environments' reliability and recoverability. The role involves conducting technical analyses, assessing disaster recovery strategies, and ensuring best practices are followed. Candidates must have significant experience in Microsoft Azure and strong communication skills. Certifications in Azure are required.

Qualifications

  • Minimum 5 years of hands-on experience.
  • Deep experience in Microsoft Azure.
  • Solid understanding of disaster recovery and business continuity.

Responsibilities

  • Assess resiliency posture of customer environments.
  • Evaluate disaster recovery plans and business continuity strategies.
  • Conduct technical analysis of cloud infrastructure.

Skills

High availability (HA) design
Disaster recovery (DR) planning
Azure architecture
Incident response coordination
Excellent communication skills

Education

Microsoft Certified: Azure Solutions Architect Expert
ITIL Foundation Certification
Microsoft Certified: Azure Administrator Associate

Tools

Azure Site Recovery
Azure Backup Vaults
Azure Automation
Job description
Overview

The Cloud Resiliency Architect will support customers through focused technical engagements aimed at improving the reliability, continuity, and recoverability of their cloud environments.

The architect will combine deep technical expertise in Microsoft Azure with a strong command of disaster recovery, business continuity, and operational resiliency processes, including the design and evaluation of Major Incident Response Plans (MIRPs).

This role requires the ability to assess both cloud infrastructure and resiliency processes end-to-end, identifying risks, aligning to best practices, and delivering clear, actionable recommendations that strengthen the customer's ability to withstand and recover from disruptions.

Key responsibilities
  • Assess the resiliency posture of customer environments from both technical and operational perspectives.
  • Evaluate the effectiveness of disaster recovery (DR) plans, business continuity (BC) strategies, and Major Incident Response Plans (MIRPs).
  • Conduct in-depth technical analysis of cloud infrastructure, application dependencies, observability configurations, and failover capabilities.
  • Review and enhance MIRPs, including escalation protocols, incident communication plans, recovery sequencing, and impact containment strategies.
  • Identify risks, vulnerabilities, and single points of failure across workloads and operational processes.
  • Recommend improvements aligned with the Azure Well-Architected Framework, SRE principles, and ITIL practices.
  • Engage customer teams to understand RTO/RPO targets, recovery workflows, and coordination models for major incidents.
  • Deliver professional, customer-facing documentation summarizing technical findings and process maturity recommendations.
Required Qualifications
  • Minimum 5 years of hands-on experience in the following areas:
  • Deep experience with high availability (HA) and disaster recovery (DR) design in Microsoft Azure.
  • Solid understanding of Azure architecture, including infrastructure, Availability Zones, backup/recovery, and monitoring services.
  • Familiarity with cloud-native resiliency patterns and site reliability engineering (SRE) methods.
  • Proven ability to assess and design effective Major Incident Response Plans (MIRPs) that align with operational SLAs and business risk tolerances.
  • Experience in business continuity planning, incident response coordination, and process maturity assessments.
  • Excellent communication and documentation skills for technical and executive stakeholders.
  • Background in technical consulting or assessment-based delivery engagements.
Preferred qualifications
  • Experience contributing to or leading the development of enterprise-wide MIRP and DR testing programs.
  • Familiarity with compliance frameworks such as ISO 22301, NIST SP 800-34, or SOC 2 Type II in the context of operational resilience.
  • Prior experience supporting regulated industries (e.g., finance, healthcare, government) with stringent uptime, data protection, or continuity mandates.
  • Hands-on experience with Azure BCDR tools such as Azure Site Recovery, Backup Vaults, Azure Automation, or Service Health Alerts.
  • Working knowledge of multi-cloud or hybrid-cloud environments and related resiliency implications.
  • Familiarity with chaos engineering or resilience testing tools and practices.
  • Experience integrating DevOps pipelines with resiliency validation steps (e.g., backup validation, DR simulation, alerting thresholds).
  • Exposure to incident simulation platforms, runbook automation, or response coordination tooling (e.g., Microsoft Sentinel, ServiceNow).
Certifications

Required

  • Microsoft Certified: Azure Solutions Architect Expert
  • ITIL Foundation Certification
  • Microsoft Certified: Azure Administrator Associate

Preferred

  • Microsoft Certified: DevOps Engineer Expert (Recommended)
  • BC/DR certifications such as CBCP, MBCI, ISO 22301 Lead Implementer, or equivalent industry-recognized certifications
  • Microsoft Certified: Cybersecurity Architect Expert
About Cognizant:

Cognizant (Nasdaq: CTSH) engineers modern businesses. We help our clients modernize technology, reimagine processes and transform experiences so they can stay ahead in our fast-changing world. Together, we’re improving everyday life. See how at www.cognizant.com or @cognizant.
