Product Site Reliability Engineer

RP International

Abu Dhabi Emirate

On-site

AED 200,000 - 300,000

Full time

5 days ago

Job summary

A leading technology consultancy is seeking a mid-senior level Product Site Reliability Engineer to ensure the performance and reliability of customer-facing products. The role involves automating workflows, enhancing system resilience, and collaborating with engineering teams. The ideal candidate has 5+ years of experience in SRE or DevOps, strong analytical skills, and cloud platform expertise. This is a full-time position based in Abu Dhabi, UAE.

Overview

We are looking for an experienced Product Site Reliability Engineer (SRE) to help ensure the performance, scalability, and reliability of our customer-facing products and platforms. The SRE designs resilient systems, automates workflows, and builds observability into the product lifecycle to enable fast-paced innovation without compromising stability.

Responsibilities

System Reliability & Performance
- Ensure availability, latency, scalability, and overall system health align with SLAs and SLOs.
- Continuously improve monitoring, alerting, and observability capabilities.
- Lead root cause analysis and conduct blameless postmortems.
- Develop and maintain incident response playbooks to reduce MTTD and MTTR.

Automation & Tooling
- Automate operational tasks to reduce manual work and improve efficiency.
- Build and maintain CI/CD pipelines and infrastructure as code (IaC) for seamless product delivery.

Collaboration with Product & Engineering
- Work closely with engineering teams to embed reliability into product design.
- Promote best practices such as chaos testing, capacity planning, and progressive deployment strategies (blue/green, canary releases).
- Define, measure, and track key reliability metrics (SLIs, SLOs, error budgets).
- Identify and implement infrastructure and architectural improvements to enhance system resilience.

Required Skills & Experience

Technical Skills

Deep knowledge of cloud platforms (AWS, GCP, or Azure).
Experience with containerization and orchestration (Docker, Kubernetes).
Proficiency in Infrastructure as Code tools (Terraform, Ansible, or similar).
Expertise in CI/CD tools (e.g., Jenkins, GitHub Actions, GitLab CI).
Familiarity with observability and monitoring tools (Prometheus, Grafana, Datadog, New Relic).
Strong scripting and programming skills (Python, Go, Bash, or similar).
Understanding of distributed systems, networking, and database reliability (SQL/NoSQL).

Professional Skills

5+ years of experience in Site Reliability Engineering, DevOps, or Production Engineering.
Strong analytical and problem-solving mindset.
Excellent communication and collaboration skills across cross-functional teams.
Demonstrated experience in incident management and conducting postmortems.

Seniority

Mid-Senior level

Employment Type

Full-time

Job Function

Engineering and Information Technology
