Manager Site Reliability Engineer

Sana Commerce

Dubai

Hybrid

AED 200,000 - 300,000

Full time

2 days ago

Job summary

A fast-growing SaaS company is looking for a Manager SRE to lead its global Site Reliability Engineering team in Dubai. This role demands extensive experience in cloud computing and requires strong leadership skills. The ideal candidate should have at least 5 years in SRE and a proven track record in managing teams. Key responsibilities include overseeing reliability, driving automation, and ensuring scalability while working closely with cross-functional teams. Attractive perks include work-from-anywhere options and a hybrid model.

Benefits

Flexibility to work from anywhere
Hybrid working model
Weekly company lunch

Qualifications

  • At least 5 years of experience in Site Reliability Engineering, with 2+ years in a leadership or management role.
  • Proven expertise in cloud computing platforms and container orchestration.
  • Excellent problem-solving skills with a proven ability to tackle complex issues under pressure.

Responsibilities

  • Lead the SRE team to achieve high reliability while balancing cost and performance SLAs.
  • Collaborate with platform and product engineering teams to embed reliability best practices.
  • Oversee incident management, post-mortem analyses, and root cause investigations.

Skills

Site Reliability Engineering
Cloud computing platforms (AWS, Azure, GCP)
Container orchestration (Kubernetes)
Network protocols
Load balancing
High availability configurations
Programming languages (C#, Python, Go, Java)
Automation tools (Terraform, Ansible)
Monitoring tools (Prometheus, Grafana, ELK Stack)

Education

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field

Tools

Terraform
Ansible
Prometheus
Grafana
ELK Stack
Dynatrace

Job description

Overview

At Sana Commerce, we're committed to an inclusive environment and recognize that our diverse workforce is one of our greatest strengths.

It all started in 2007, with a pizza and a plan. Sana Commerce is an e-commerce platform designed to help manufacturers, distributors and wholesalers succeed by fostering lasting relationships with customers who depend on them. We’re a fast-growing SaaS company that allows you to take ownership of your career.

At Sana Commerce, we're looking for a Manager SRE to build and manage our global SRE team, which manages and monitors all installed systems, environments, and infrastructure and resolves issues that come in through our notification system.

What you'll get
  • The opportunity to make an impact at a fast-growing SaaS scale-up;
  • Up to 5 weeks of “work from anywhere” per year;
  • A global and customized onboarding program (rated 9.1/10 by previous hires);
  • A hybrid working model – 3 days from the office, 2 days from home;
  • Weekly company lunch on us.
What you'll be doing
  • Leading the SRE team, setting objectives, and guiding the team towards achieving high reliability while balancing cost and performance SLAs.
  • Collaborating with platform & product engineering teams to embed reliability and operational best practices into the software development lifecycle.
  • Developing and implementing SRE policies and practices, including service level objectives (SLOs), service level indicators (SLIs), and error budgets.
  • Driving automation across operations to reduce toil, improve system performance, and ensure scalability, with a healthy allergy to repetitive manual work.
  • Overseeing incident management, post-mortem analyses, and root cause investigations to prevent future outages and enhance system reliability.
  • Facilitating capacity planning and scalability exercises to manage growth and ensure the efficient use of resources.
  • Facilitating disaster recovery plans & testing to ensure business continuity for our customers’ webstores.
  • Encouraging a culture of continuous improvement by mentoring team members and fostering innovation within the team.
  • Staying up to date with the latest trends and technologies in SRE and advocating for their adoption where appropriate.
Qualifications
What you'll bring:
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field.
  • At least 5 years of experience in Site Reliability Engineering, with 2+ years in a leadership or management role.
  • Proven expertise in cloud computing platforms (e.g., AWS, Azure, GCP) and experience with container orchestration (e.g., Kubernetes).
  • A deep understanding of network protocols, load balancing, and high availability configurations.
  • Experience in applying software development solutions to SRE and familiarity with programming languages, preferably PowerShell and C#, or otherwise Python, Go, or Java.
  • Experience with automation and infrastructure-as-code tools (e.g., Terraform, Ansible).
  • Proficiency in monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack) and in implementing comprehensive monitoring solutions. Dynatrace knowledge is a plus.
  • Excellent problem-solving skills, with a proven ability to tackle complex issues under pressure.
  • Outstanding leadership qualities, with a track record of mentoring and developing high-performing teams.
  • Exceptional communication and collaboration skills, capable of working effectively with cross-functional teams.

Additional Information

#LI-Hybrid
