
Senior Site Reliability Engineer (Crypto Exchange)

Hyphen Connect

Singapore

On-site

SGD 90,000 - 120,000

Full time

7 days ago

Job summary

A forward-thinking crypto exchange company in Singapore is seeking a Senior Site Reliability Engineer to enhance system reliability and scalability. Responsibilities include designing infrastructure, automating deployment pipelines, and collaborating with engineering teams to improve efficiency. Ideal candidates will have over 5 years of experience and a strong background in low-latency systems and cloud-native platforms.

Qualifications

  • 5+ years of relevant experience as a DevOps/SRE engineer.
  • Proven ability to participate in an on-call rotation with ownership.
  • Extensive experience operating low-latency, distributed systems.

Responsibilities

  • Design and maintain scalable infrastructure for a trading platform.
  • Enhance Kubernetes environments for stability and security.
  • Develop automation pipelines using Terraform and Ansible.

Skills

DevOps/SRE experience
Incident response ownership
Low-latency systems operation
Cloud-native platforms proficiency
Linux/Unix knowledge
Scripting in Bash, Go, or Python
Root cause analysis expertise
Infrastructure as code
GitOps workflows familiarity
Production systems ownership

Tools

Kubernetes
AWS
GCP
Terraform
Ansible
GitHub Actions

Job description

Senior Site Reliability Engineer (Crypto Exchange)

We are working with a decentralised exchange that aims to combine the best of centralised and decentralised exchanges (CEXs and DEXs), with a focus on building a safe, simple, and scalable trading platform. It differentiates itself by offering institutional-level systems and support while remaining on-chain and decentralised.

We are seeking a Senior Site Reliability Engineer to join the team and ensure the stability, scalability, and performance of a cutting-edge platform. You will balance production reliability with engineering-driven automation, reducing manual work through better tooling and process improvements. This role requires a strong commitment to on-call ownership and a passion for building resilient, observable, and self-healing infrastructure.

Key Responsibilities
  • Design, implement, and maintain scalable infrastructure for a high-performance, low-latency trading platform.
  • Operate and enhance Kubernetes and Nomad-based environments to ensure system stability, scalability, and security.
  • Develop infrastructure automation and deployment pipelines using Terraform, Ansible, ArgoCD, and GitHub Actions.
  • Collaborate with engineering teams to streamline service onboarding, automate repetitive tasks, and improve deployment efficiency.
  • Enhance observability and reliability through improved logging, metrics, tracing, and alerting using the Grafana ecosystem.
  • Perform root cause analysis and postmortems for production incidents, driving continuous improvements in system resilience and incident response.
  • Work with security and compliance teams to ensure infrastructure meets regulatory and organizational standards.
  • Support multi-environment deployments (dev, staging, testnet, mainnet) with a focus on safe rollouts, rollbacks, and configuration management.
  • Contribute to capacity planning, cost optimization, and infrastructure scaling strategies to support platform growth.

Experience & Skills Requirements
  • 5+ years of relevant experience as a DevOps/SRE engineer.
  • Proven ability to participate in an on-call rotation, demonstrating ownership in incident response and a focus on long-term system stability.
  • Extensive experience operating and maintaining low-latency, distributed systems in production environments.
  • Proficiency with cloud-native platforms and container orchestration tools, including AWS, GCP, Kubernetes, and Nomad.
  • Strong knowledge of Linux/Unix internals and the TCP/IP networking stack.
  • Proficiency in one or more of: Bash, Go, or Python.
  • Expertise in root cause analysis, performance tuning, and system-level debugging in complex service architectures.
  • Experience building and managing end-to-end infrastructure, including infrastructure as code, CI/CD pipelines, and monitoring systems.
  • Familiarity with modern GitOps workflows and tools such as GitHub Actions, ArgoCD, Argo Workflows, and Argo Events.
  • Ability to own production systems end-to-end, from infrastructure as code to automated monitoring and deployment workflows.
  • A pragmatic approach that favours depth, ownership, and a bias for action over broad familiarity.
  • Bonus: experience with the Aeron messaging system.
