Senior Site Reliability Engineer (SRE) - 10+ Yrs

GSSTech Group

Dubai

On-site

AED 300,000 - 400,000

Full time

30+ days ago

Job summary

A leading technology company in Dubai is seeking a highly experienced Senior Site Reliability Engineer (SRE) with 10–15 years of experience, specifically in the banking domain. This role involves ensuring the reliability, scalability, and performance of critical banking applications. Candidates must possess strong problem-solving skills and have a background in automation and cloud infrastructure management. Knowledge of monitoring tools like Prometheus and Grafana, as well as cloud platforms, is essential. Competitive compensation is offered for this position.

Qualifications

  • 10–15 years of experience in Site Reliability Engineering or related fields.
  • Mandatory banking domain experience with a deep understanding of financial systems.
  • Strong knowledge of cloud platforms and container orchestration.

Responsibilities

  • Ensure system reliability, scalability, and performance across banking applications.
  • Manage automation, incident management, and performance optimization.
  • Oversee cloud infrastructure management.

Skills

Problem-solving skills
Communication skills
Troubleshooting skills

Tools

Prometheus
Grafana
Splunk
AWS
Azure
GCP
Kubernetes
Docker
Terraform
Ansible

Job description

We are seeking a highly experienced Senior Site Reliability Engineer (SRE) with 10–15 years of experience and a proven background in the banking domain. The role involves ensuring system reliability, scalability, and performance across critical banking applications. You will be responsible for automation, incident management, performance optimization, and cloud infrastructure management. Strong problem-solving skills and the ability to work in high-pressure environments are essential.

  • 10–15 years of experience in Site Reliability Engineering or related fields.
  • Mandatory banking domain experience with a deep understanding of financial systems.
  • Expertise in monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, Splunk).
  • Strong knowledge of cloud platforms (AWS, Azure, or GCP), containers (Docker), and container orchestration (Kubernetes).
  • Proficiency in scripting/programming languages (Python, Shell, Go, or similar).
  • Experience with CI/CD pipelines and Infrastructure as Code tools (Terraform, Ansible).
  • Strong background in incident management, root cause analysis, and problem resolution.
  • Understanding of security best practices and compliance in banking environments.
  • Excellent troubleshooting skills for large-scale distributed systems.
  • Strong communication, collaboration, and stakeholder management skills.