SRE Engineer (Azure)

QONSULT SYSTEMS PTE. LTD.

Greater London

On-site

GBP 60,000 - 80,000

Full time

2 days ago

Job summary

A cloud solutions provider is looking for an experienced SRE Engineer specializing in Azure to ensure the reliability and performance of cloud platforms. The role involves designing high-availability systems, implementing automation, and working across teams to maintain robust CI/CD pipelines. Candidates should have a Bachelor's degree and at least 4 years of experience in cloud reliability engineering, with proficiency in tools like Terraform and Azure Monitor. This position emphasizes security best practices and operational excellence.

Qualifications

  • 4+ years of experience in a cloud reliability/SRE role focused on Azure.
  • Strong understanding of logging and monitoring tools in Azure.
  • Hands-on experience with Infrastructure as Code tools.

Responsibilities

  • Design and maintain monitoring dashboards across Azure services.
  • Implement high-availability and reliability solutions in cloud infrastructure.
  • Ensure compliance with cloud security controls and document procedures.

Skills

Incident & Problem Management
Configuration & Change Management
Observability and Reliability Engineering
Strong communication & stakeholder engagement
Ability to work effectively across technical teams

Education

Bachelor’s Degree in Computer/Information Science or equivalent

Tools

Terraform
Bicep
PowerShell
Python
Azure Monitor
Log Analytics
App Insights
AKS
GitHub Actions
Azure DevOps

Job description

SRE Engineer (Azure)

We are looking for an Azure Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of our cloud platforms. The SRE Engineer will architect, implement, and operate highly available systems with a strong emphasis on automation, observability, and security best practices.

The candidate will work closely with engineering and project teams to ensure our Azure services meet organizational objectives for performance, resilience, and cost-efficiency.

Responsibilities

  • Demonstrate expertise in cloud reliability engineering, high-availability patterns, observability frameworks, and automation with a security-first mindset.
  • Design, implement, and maintain SLOs, SLIs, monitoring dashboards, and automated alerting mechanisms across Azure services.
  • Ensure reliability of mission-critical systems by implementing autoscaling, redundancy, failover, and resilient architectures.
  • Develop automation using Terraform/Bicep, PowerShell, and Python to reduce operational toil and improve system reliability.
  • Collaborate with engineering teams to support secure, reliable CI/CD pipelines and deployment processes.
  • Conduct root cause analysis (RCA), implement corrective actions, and lead continuous improvement of reliability processes.
  • Continuously monitor Azure resources and optimize performance, cost, and operational health based on best practices.
  • Ensure all deployed workloads comply with cloud security baselines, network boundary controls, and governance frameworks (e.g., IM8, CIS, NIST).
  • Improve infrastructure readiness through chaos engineering, failover tests, and resilience validation.
  • Prepare operational runbooks, architecture documents, and technical guides for cloud reliability operations.
  • Support Agile workflows and collaborate across teams to integrate operational excellence into the development lifecycle.

Qualifications & Work Experience

  • Bachelor’s Degree in Computer/Information Science or equivalent.
  • 4+ years of experience in a cloud reliability/SRE role with an emphasis on Azure.
  • Strong understanding of Azure Monitor, Log Analytics, App Insights, AKS, VNets, Load Balancers, and HA designs.
  • Hands‑on experience with IaC tools such as Terraform, Bicep, or ARM templates.
  • Strong scripting capabilities (PowerShell/Python).
  • Experience with CI/CD pipelines (GitHub Actions, Azure DevOps).
  • Solid understanding of cloud security controls, compliance frameworks, and incident management.
  • Exceptional troubleshooting and problem‑solving skills.

Skills

  • Incident & Problem Management
  • Configuration & Change Management
  • Observability and Reliability Engineering
  • Strong communication & stakeholder engagement
  • Ability to work effectively across technical teams