Senior Cloud Platform Engineer

Foodics

Saudi Arabia

On-site

SAR 200,000 - 300,000

Full time

Today

Job summary

A leading restaurant management ecosystem provider in Saudi Arabia seeks a Staff Site Reliability Engineer to ensure the reliability and performance of their cloud-native platforms. As part of a high-impact engineering team, you will design scalable infrastructure solutions, lead incident response, and drive best practices in observability and incident management. Ideal candidates will have expertise in SRE principles, cloud technologies, and a collaborative approach across teams. Competitive compensation and growth opportunities are provided.

Benefits

Highly competitive compensation packages
Annual learning stipend
Exposure to cutting-edge cloud technologies

Qualifications

  • Strong background in SRE principles such as SLIs, SLOs, and SLAs.
  • Experience with Kubernetes and container orchestration.
  • Proven expertise in infrastructure as code and automation scripting.
  • Deep understanding of monitoring and alerting systems.

Responsibilities

  • Design and maintain scalable and fault-tolerant systems across multiple cloud providers.
  • Lead incident response efforts and conduct post-mortems.
  • Build and refine automated deployment pipelines.
  • Implement robust observability frameworks to detect performance issues.
  • Collaborate with development teams to integrate reliability.

Skills

SRE principles
Kubernetes
Terraform
Prometheus/Grafana
Cloud networking
Troubleshooting

Tools

AWS
Ansible
MySQL
Python

Job description

Who Are We❓

We are Foodics, a leading restaurant management ecosystem and payment tech provider. Founded in 2014 and headquartered in Riyadh, we have offices across 5 countries, including the UAE, Egypt, Jordan, and Kuwait. We currently serve customers and partners in over 35 countries worldwide. Our innovative products have processed over 6 billion (yes, billion with a B) orders so far, making Foodics one of the most rapidly evolving SaaS companies to ever emerge from the MENA region. Foodics has also completed three rounds of funding, with the latest raising $170 million in the largest SaaS funding round in MENA, boosting its innovation capabilities to better serve business owners.

The Job in a Nutshell💡

We are seeking a Staff Site Reliability Engineer (SRE) to join our high-impact engineering team. In this role, you will be responsible for ensuring the scalability, performance, and reliability of Foodics’ cloud-native platforms and services. You will design, implement, and evolve infrastructure solutions and operational processes that support millions of transactions daily, while championing best practices in observability, incident management, and resilience engineering. Your expertise will help us maintain world-class uptime and seamless customer experiences as we continue to grow at scale.

What Will You Do❓
  • Design and maintain scalable, highly available, and fault-tolerant systems across multiple cloud providers (AWS, OCI).
  • Lead incident response efforts, conducting blameless post-mortems and driving systemic improvements.
  • Build and refine automated deployment pipelines, ensuring fast, safe, and repeatable delivery of changes.
  • Implement robust observability frameworks (metrics, tracing, logging) to proactively detect and address performance issues.
  • Collaborate with development teams to embed reliability into every stage of the software lifecycle.
  • Optimize infrastructure costs while maintaining service quality.
  • Drive chaos engineering experiments to validate system resilience.
  • Document architecture, runbooks, and operational processes for internal and cross-team use.
What Are We Looking For❓

We’re looking for a reliability-focused engineer with strong technical depth who thrives on solving complex operational challenges at scale. You must be hands‑on with distributed systems, cloud‑native platforms, and automation tools.

  • Strong background in SRE principles (SLIs, SLOs, SLAs) and operational excellence.
  • Experience with Kubernetes, container orchestration, and service mesh technologies.
  • Proven expertise in infrastructure as code (Terraform, Ansible, Crossplane optional) and automation scripting (Bash, Python, Go).
  • Deep understanding of monitoring and alerting systems (Prometheus/Grafana, ELK, Loki, Datadog, AWS CloudWatch).
  • Skilled in cloud networking, load balancing, and API gateway management (NGINX, Kong, AWS API GW).
  • Solid experience with relational and NoSQL databases in production (MySQL/PostgreSQL, MongoDB, DocumentDB, Redis).
  • Familiarity with distributed tracing (Jaeger, OpenTelemetry) and chaos testing frameworks.
  • Excellent troubleshooting skills and ability to resolve high-impact incidents under pressure.
Who Will Excel
  • Candidates who have successfully operated high‑traffic, mission‑critical platforms in a cloud‑native environment.
  • Candidates who demonstrate strong collaboration and communication skills across engineering, product, and business teams.
  • Candidates who bring a data‑driven approach to performance tuning and capacity planning.
  • Candidates who thrive in fast‑paced, high‑growth SaaS environments and embrace continuous improvement.
What We Offer You❗

We believe you will love working at Foodics!

  • Highly competitive compensation packages, including bonuses and potential equity.
  • Annual learning stipend and regular training to accelerate your career.
  • Exposure to cutting‑edge cloud technologies and large‑scale distributed systems.
  • A truly global team of over 30 nationalities in 14 countries.
  • Autonomy, challenging goals, and the chance to directly impact the reliability of platforms serving millions.