Cloud Site Reliability Engineer

SOLACE SYSTEMS SINGAPORE PTE. LTD.

Singapore

Hybrid

SGD 100,000 - 125,000

Full time

Today

Job summary

A leading tech firm in Singapore is seeking a Cloud Site Reliability Engineer to manage the daily operations of its SaaS product. The ideal candidate will ensure service reliability, enhance tooling, and provide support for cloud environments. Extensive experience with public cloud providers and Kubernetes, along with a passion for technology, is required. This role offers a hybrid work model and opportunities for growth.

Benefits

Work with brilliant minds
Hybrid work flexibility
Growth mindset training programs
Values-driven culture
Social and fun environment

Qualifications

  • Hands-on experience with public cloud providers and features.
  • Expert knowledge in Kubernetes infrastructure.
  • Experience in debugging production alerts and incidents.
  • Strong communication skills for customer interactions.

Responsibilities

  • Ensure Solace Cloud Services are reliable and meet SLAs.
  • Improve infrastructure tooling, observability, and automation.
  • Troubleshoot and resolve operational issues with customers.
  • Provide on-call support with an efficient incident management process.

Skills

Public cloud providers (AWS, Azure, GCP)
Kubernetes infrastructure platforms
Monitoring tools (Datadog, Kibana, Prometheus)
Infrastructure Automation (Terraform, CloudFormation)
Linux Operating Systems
Programming languages (Groovy, Python, Go)
Certified Kubernetes Administrator
Certified Cloud Administrator (AWS, Azure, GCP)

Job description

Solace helps companies connect and integrate all of their assets through the power of event-driven architecture. Our technology makes it easy to unlock data silos and capture events occurring across large enterprises; stream information about those events everywhere it needs to be in real-time; and give the apps, AI agents and people who receive it the power to immediately react with decisive actions and smart decisions.

Many of the world’s biggest companies trust Solace to modernize their IT infrastructure by embracing trends like AI, cloud and IoT so they can create awesome experiences for their customers, partners and employees.

So, the next time you drive a car, order furniture online, fly in a plane, or check your bank balance on your phone, your positive experience could be a direct result of our technology, and your hard work!

Overview

This position is for a Cloud Site Reliability Engineer in either Singapore or India (Bangalore). You will be responsible for the daily operations of Solace Cloud, our market-leading SaaS offering, across leading cloud providers and platforms such as Amazon Web Services, Microsoft Azure, Google Cloud Platform, and Kubernetes.

What You'll Do:
  • Ensure that the Solace Cloud Services are healthy and reliable, and that SLAs are being met
  • Improve our infrastructure tooling, observability, and automation
  • Contribute to making production operations more efficient and less error-prone
  • Handle production incidents in production-grade multi-cloud environments according to industry-standard incident management processes
  • Process and handle service requests and provisioning requests from customers
  • Work directly with customers to identify, troubleshoot, and resolve operational issues
  • Apply expert debugging knowledge of Linux and Kubernetes to detect malicious activity
  • Join the on-call rotation and provide 12x7 off-hours support

Ideally, You Will Be:
  • Highly technical, excited by technology, and eager to stay up to date in a rapidly evolving environment
  • Experienced in cloud networking solutions
  • Able to debug at a system level and resolve incidents in complex cloud-based environments
  • Experienced in site reliability engineering and incident response
  • A strong communicator who can articulate complex technical issues clearly and concisely and get on the phone with customers
  • Experienced in SaaS operations and customer-facing technical support

Required Skills:
  • Hands-on experience with public cloud provider (AWS, Azure, GCP) services and features
  • Hands-on experience with cloud Kubernetes infrastructure platforms such as Amazon Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS), and Google Kubernetes Engine (GKE)
  • Hands-on experience with monitoring tools such as Datadog, Kibana, and Prometheus
  • Hands-on experience with infrastructure automation using Terraform and CloudFormation
  • Hands-on experience with debugging production alerts
  • Expert-level understanding of Linux operating systems
  • Programming experience in languages such as Groovy, Python, and Go
  • Certified Kubernetes Administrator
  • Certified Cloud Administrator (AWS, Azure, or GCP)

Why You’ll Love Working at Solace

At Solace, we’re all about smart people, meaningful work, and good vibes.

  • Work with brilliance – Our team is packed with some of the sharpest minds in the industry.
  • Balance matters – We believe work should fit into your life, not the other way around.
  • Hybrid-first – Flexibility is built into how we work, so everyone feels included and empowered.
  • Values-driven – We live and breathe our core values: craftsmanship, trust, courage, freedom, momentum, humility, and human experience.
  • Growth mindset – Our training programs are designed to help you level up, fast.
  • Customer Obsessed – We’re proud of our world-class customer lineup (we’re not shy about it).
  • Keep it fun – We’re social, we keep things simple, and we know how to have a good time.
  • Creative culture – We’ve got a great sense of humour and we make cool videos on topics like MITT and this (check them out!).