SRE Engineer

Numi

Dubai

On-site

AED 293,000 - 441,000

Full time

Today

Job summary

A technology company is seeking a Site Reliability Engineer to work in a hybrid/remote environment. The role involves improving live microservices, collaborating with development teams, and managing cloud infrastructure on AWS. Ideal candidates will have a background in microservices and debugging, and will be comfortable with tools like Docker and Prometheus. This position offers the opportunity to shape the company's infrastructure as it evolves and to grow within a dedicated team.

Qualifications

  • Experience debugging live applications and resolving production issues quickly.
  • Background in building and supporting microservice-based applications.
  • Experience with AWS services and containerisation tools.
  • Familiarity with infrastructure-as-code and CI/CD pipelines.
  • Comfortable using monitoring/observability tools.

Responsibilities

  • Hunt down bugs in live microservices to improve stability.
  • Collaborate with dev teams to boost code quality.
  • Manage cloud infrastructure (AWS) effectively.
  • Build better monitoring and alerting systems.
  • Solve complex technical puzzles for escalated issues.
  • Write and maintain automation scripts for efficiency.
  • Participate in the on-call rota for live issues.

Skills

Debugging live applications
Building microservice-based applications
Working with AWS
Using MongoDB
Containerisation tools (Docker)
Infrastructure as code
CI/CD pipelines
Scripting (Python or JS)
Monitoring tools (Prometheus, Grafana)

Tools

AWS
Docker
CI/CD pipelines
Prometheus
Grafana
Job description

Role: Site Reliability Engineer

Location: Hybrid / Remote (UK-based)

Tech Stack: AWS, MongoDB, Docker, CI/CD, Prometheus, Python

Why This Role

Looking to work at the intersection of DevOps, backend engineering, and real-time problem-solving? Here's your chance to make a real impact in a high-scale cloud environment, keeping production systems fast, reliable, and resilient for thousands of users.

You'll join a collaborative, tech-savvy team dedicated to making things just work better. From improving observability across microservices to responding to high-priority incidents, this is your platform to shape how scalable applications are delivered and supported.

What You'll Be Doing
  • Fix and improve: Hunt down bugs in live microservices and make production more stable every day.
  • Pair up with engineers: Collaborate with dev teams to sharpen code quality, boost resilience, and embed observability from the start.
  • Own the cloud: Configure and manage cloud infrastructure (AWS), keeping everything humming at scale.
  • Watch the signals: Build better monitoring and alerting systems to catch issues before they escalate.
  • Troubleshoot deeply: Solve complex technical puzzles and help guide others through them.
  • Automate everything: Write and maintain SOPs and automation scripts to reduce manual toil.
  • Be the calm in the storm: Participate in the on-call rota and take ownership of live issues when they arise.
What We're Looking For
  • Solid experience debugging live applications and resolving production issues fast.
  • Background in building and supporting microservice-based applications.
  • Confidence working with MongoDB, AWS services, and containerisation tools like Docker or ECS.
  • Familiarity with infrastructure-as-code and CI/CD pipelines (CloudFormation, CodeBuild, etc.).
  • Comfort using monitoring/observability tools like Prometheus, New Relic, Grafana, or Datadog.
  • Good grasp of scripting (Python or JS) for automation and tooling.
  • Clear thinking in the face of incidents, plus the drive to learn from them.
Bonus Points For
  • Knowledge of REST, GraphQL, and async messaging systems.
  • Experience with Git workflows and CI/CD pipelines.
  • Understanding of SRE principles (SLIs, SLOs, error budgets, etc.).
  • Awareness of security and compliance (GDPR, privacy, risk management).
  • Clear communicator with a team-first attitude.
Why You'll Love It Here
  • You'll work with brilliant engineers who care about quality, automation, and clean code.
  • You'll have the freedom to shape infrastructure as we scale and evolve.
  • You'll gain deep exposure to modern DevOps tooling, incident response strategy, and production engineering.
  • Your voice will matter, from tech choices to process improvements.

Apply direct or contact
