Senior Production Engineer

Borr Drilling

Singapore

On-site

SGD 85,000 - 130,000

Full time

7 days ago

Job summary

A well-funded global crypto custodian firm is seeking a Senior Production Engineer in Singapore to ensure the reliability and scalability of its operations. The role blends software engineering with operational expertise: overseeing system performance and driving a culture of reliability and continuous improvement.

Qualifications

8+ years in software engineering and production infrastructure roles.
Strong coding skills in Java; experience with Spring framework.
Deep hands-on experience with Azure cloud services.

Responsibilities

Responsible for production stability, including incident response and infrastructure.
Implement and maintain SLIs, SLOs, and error budgets.
Lead root cause analysis and resilience improvements.

Skills

Software Engineering

Production Infrastructure

Java

Azure

Site Reliability Engineering

Distributed Systems

Microservices Architecture

Container Orchestration

Education

Degree in a STEM subject

Tools

Terraform

Grafana

Prometheus

A well-funded global crypto custodian firm is seeking a Senior Production Engineer to join its team in Singapore.

Role Summary

As a Senior Production Engineer, you will be responsible for the reliability, scalability, and observability of our platform. This role combines software engineering expertise with deep operational ownership. You will define and implement the strategies that keep our systems resilient, performant, and secure, and you will represent system health and risks during product roadmap discussions.

You will be both a hands-on technical leader and a strategic contributor, working across Engineering and Product to build a world-class digital assets infrastructure. This will involve developing our existing product suite and integrating with a variety of third-party services to form a digital asset trading ecosystem.

Key Responsibilities

Responsible for production stability, including site reliability engineering, infrastructure, observability, and incident response.
Provide thought-leadership and architectural design input to the Engineering team with respect to system resilience and scale.
Drive a culture of proactive reliability, incident learning, and continuous improvement.
Own and evolve our monitoring, alerting, and incident management frameworks.
Lead root cause analysis, postmortems, and resilience improvements.
Implement and maintain SLIs, SLOs, and error budgets in collaboration with platform and product teams.
Develop strategies for system scaling (horizontal and vertical), performance tuning, and capacity planning on Azure.
Lead engineering team efforts in disaster recovery and failover planning.
Design and implement tools and automation to support self-healing, auto-scaling, and rapid recovery systems.
Contribute hands-on to the backend codebase (Java/Spring) to improve runtime performance, observability, and fault tolerance.
Represent platform stability, risk, and incident trends in Product Prioritisation and Planning meetings.

Required Skills and Experience

8+ years in software engineering and/or production infrastructure roles.
Strong coding skills in Java and experience with the Spring ecosystem.
Deep hands-on experience with cloud services, ideally Azure, including AKS, Azure Monitor, Application Insights, and Key Vault.
Expertise in observability tooling (e.g., Graylog, Prometheus, Grafana, ELK, OpenTelemetry).
Proven experience running mission-critical, high-uptime services, ideally in fintech, crypto, or other transactional environments.
Solid understanding of distributed systems, microservices architecture, and container orchestration (Kubernetes and Docker).
Experience with Infrastructure as Code tools (Terraform, Bicep, or similar).

Preferred Qualifications

Familiarity with blockchain systems, digital asset custody, or crypto exchange platforms.
Strong skills in RDBMS performance tuning, ideally on Microsoft SQL Server.
Experience with regulated environments (e.g., financial compliance, GDPR, ISO 27001).
Strong understanding of SLAs/SLOs/SLIs and the principles of site reliability engineering.
Exposure to chaos engineering, performance testing, and auto-remediation strategies.
Good degree in a STEM subject.
