SRE Lead - Top tier Crypto Exchange - J12354

Pinpoint Asia Limited

Singapore

On-site

SGD 80,000 - 120,000

Full time

Today

Job summary

A leading technology firm in Singapore is seeking an experienced Site Reliability Engineer to lead their SRE team. Responsibilities include establishing incident response systems and collaborating across teams to drive reliability goals. Candidates should have over 8 years of experience in back-end engineering, 4 years in SRE, and management experience. This position provides an opportunity to work in a cutting-edge digital economy environment, focusing on maintaining high availability and security of web3 solutions.

Qualifications

  • 8+ years of back-end/platform/operations experience.
  • 4+ years of SRE or production engineering experience.
  • 2+ years of team management or leadership experience.

Responsibilities

  • Form and manage the SRE team.
  • Drive the inclusion of reliability goals in the roadmap.
  • Conduct audits of sensitive operations.

Skills

Back-end/platform/operations experience
Incident handling in high-concurrency systems
Observability systems (Prometheus/Grafana)
Kubernetes and CI/CD tools
Performance and capacity engineering

Tools

Kubernetes
MySQL
Kafka
Redis

Job description

Our client is a leading web3 firm whose platform and wallet combine industry-leading security features with a powerful, intuitive interface, so that users can manage their cryptocurrency assets securely and easily. The platform lets users store, send, and receive a wide range of digital assets, and it is built with advanced encryption protocols to keep those assets protected. The firm is expanding its business and is looking for an experienced Site Reliability Engineer to join its exchange team.

About the Role

As SRE Lead, you will form and manage the SRE team. You will also establish a unified incident response system and promote blameless postmortems and systematic improvements.

Key Responsibilities
  • Strategy and Governance
  • Team and Organization
  • Cross-team collaboration: work with R&D, architecture, DBA, network, security, and legal/compliance teams to drive the inclusion of reliability goals in the roadmap and KPIs.
  • Platform and Engineering Implementation
  • Exchange-specific projects, such as end-to-end latency SLIs, matching confirmation and replay, sequence number consistency and idempotency, and isolation of hot trading pairs.
  • Multi-chain node operations and maintenance: congestion and reorg handling, MPC/HSM, risk control and approval flows for coin deposits and withdrawals, and closed-loop resolution of reconciliation errors.
  • Security and Compliance: audit of sensitive operations and meeting requirements such as SOC2/ISO 27001/PCI-DSS.
Required Skills & Experience
  • Over 8 years of experience in back‑end/platform/operation and maintenance engineering, over 4 years of SRE or production engineering experience, and over 2 years of team management/leadership experience.
  • A proven track record of stability governance and incident handling in high-concurrency, low-latency businesses (transactions/payments/advertising/large-scale real-time systems).
  • SLO/SLI and error budget practices; building observability systems (Prometheus/Grafana/ELK or similar, OpenTelemetry, tracing).
  • Kubernetes/Service Mesh, microservice gateway (Nginx/Envoy), CI/CD (GitHub Actions/GitLab CI, etc.), GitOps (Argo CD).
  • Design and implementation of progressive delivery (canary releases, batched rollouts, feature flags) and automatic rollback strategies.
  • Data and Storage: MySQL sharding, replication, and failover; Redis/Kafka; backup and disaster recovery drills; a consistency and reconciliation mindset.
  • Performance and Capacity Engineering: stress testing, benchmarking, analysis, and tuning (flame graphs, CPU, GC, network, TCP kernel parameters, etc.).
  • Incident management: SEV grading, IM/IC command, cross-team collaboration and communication, writing high-quality retrospectives, and tracking action items.
Preferred Experience
  • Experience with exchanges, matching engines, payment clearing and settlement, or the operation and maintenance of securities-firm systems, crypto wallets, and chain nodes.
  • Experience in implementing anti-DDoS, WAF, bot management, rate limiting, and traffic governance systems.
  • Experience in compliance systems (SOC2, ISO 27001, PCI‑DSS, SOX‑class controls), security audits, and evidence retention.
  • Experience in multi‑region GSLB, cross‑cloud/multi‑cloud architecture, Chaos engineering, and GameDay organization.
  • Go/Java optimization experience, practical experience in messaging systems (Kafka/RocketMQ/Pulsar) and storage (TiDB/Vitess/Citus/TDSQL, etc.).
  • Experience in cost optimization and FinOps.

Headquartered in Hong Kong, with a solid track record of well over a decade, Pinpoint Asia Infotech Pte Ltd (EA License: 16C8291) is the go-to IT search firm for several top investment banks and financial institutions.

If you are interested in the above position, please send your CV to Charlie Kim at resume.sg@pinpointasia.com (EA Registration No: R23112483).
