Site Reliability Engineer (SRE)

Dicetek LLC

Dubai

On-site

AED 200,000 - 300,000

Full time

7 days ago

Job summary

A top technology service provider is seeking a skilled Site Reliability Engineer to improve system reliability and manage cloud infrastructure. Candidates should have more than 6 years of experience in site reliability, with solid expertise in AWS or Azure and strong Linux administration skills. Responsibilities include implementing SLIs and SLOs, managing incidents, and optimizing CI/CD pipelines. The role emphasizes automation and collaboration with development teams to improve system performance and reliability.

Benefits

Competitive salary
Health insurance
Flexible working hours

Qualifications

  • 6+ years of hands-on experience as a Site Reliability Engineer (SRE).
  • Experience supporting high-availability, mission-critical systems in production.
  • Strong documentation and stakeholder communication skills.

Responsibilities

  • Define, implement, and manage SLIs, SLOs, and Error Budgets across critical systems.
  • Design and maintain highly reliable, scalable, and fault-tolerant production environments.
  • Drive toil reduction, automation, and self-healing systems.

Skills

Site Reliability Engineering experience
Strong Linux system administration
Incident management
Cloud infrastructure (AWS, Azure)
Continuous Integration/Continuous Deployment
Observability and monitoring tools

Tools

Kubernetes
Terraform
GitLab
Prometheus
Grafana

Job description

Core Skills We're Looking For (Must-Have)

6+ years of hands-on experience as a Site Reliability Engineer (SRE)

Define, implement, and manage SLIs, SLOs, and Error Budgets across critical systems

Design and maintain highly reliable, scalable, and fault-tolerant production environments

Drive toil reduction, automation, and self-healing systems using Python/Bash

Strong Linux system administration experience in production environments

Manage and operate container platforms such as Kubernetes or OpenShift

Implement Infrastructure as Code (IaC) using Terraform/Ansible

Build, maintain, and optimize CI/CD pipelines using GitLab, Jenkins, or Azure DevOps

Implement safe deployment strategies including Blue-Green, Canary, and Rolling deployments

Hands-on experience with AWS, Azure, or Private Cloud infrastructure

Own incident management, on-call rotations, post-incident reviews, and Root Cause Analysis (RCA)

Implement and manage observability and monitoring using Prometheus, Grafana, and the ELK/EFK stack

Reduce alert noise and improve alert quality to avoid alert fatigue

Work closely with development and platform teams to improve system reliability

Ensure systems comply with security, governance, and compliance standards (ISO, NIST, ITIL)

Experience supporting high-availability, mission-critical systems in production

Preferred / Good to Have

Experience in large-scale government or regulated environments

Exposure to microservices-based architectures

Familiarity with chaos testing tools and advanced resilience frameworks

Strong documentation and stakeholder communication skills
