Senior Site Reliability Engineer ELK

SWIFT SUPPORT SERVICES MALAYSIA SDN. BHD.

Kuala Lumpur

On-site

MYR 100,000 - 130,000

Full time

Yesterday

Job summary

A leading financial messaging services provider in Kuala Lumpur is seeking an experienced Site Reliability Engineer. The role involves crafting end-to-end delivery pipelines, deploying infrastructure, and ensuring system reliability. The ideal candidate has at least 8 years of experience in SRE or software development, familiarity with big data technologies, and proficiency in CI/CD tools. The role offers a competitive package and the opportunity to make a significant impact in a diverse and inclusive environment.

Benefits

Competitive package

Career control

Performance support

Diverse and inclusive environment

Qualifications

Minimum 8 years of SRE/Software development experience.
Hands-on experience with big data technologies (Elasticsearch, Logstash, Kibana, Kafka).
Experience with CI/CD tools including Maven, Jenkins, and Docker.

Responsibilities

Contribute to deployment phases ensuring production readiness.
Develop automation scripts and improve system reliability.
Collaborate with technical teams on integration solutions.

Skills

SRE/Software development

Data ingestion with big data technologies

CI/CD tools (Maven, Jenkins, Nexus, Git, Docker)

Linux OS proficiency

Scripting and automation (Python, PowerShell, YAML)

Distributed systems and microservices

ITIL processes

Agile working environment

Customer-centric mindset

Collaboration across cultures

Tools

Ansible

Terraform

Kubernetes

OpenShift

Overview

We’re the world’s leading provider of secure financial messaging services, headquartered in Belgium. We move value securely across borders, through cities and overseas, and support the global economy. Swift has a presence in 200+ countries and legal territories to serve a community of more than 12,000 banks and financial institutions.

Join our central DevOps Engineering Services organization at Swift as a Site Reliability Engineer. You’ll be pivotal in crafting end-to-end delivery pipelines, ensuring seamless integration, deployment of infrastructure and software, and providing essential maintenance and support to our developer community. With a strong focus on zero-trust strategies, problem-solving capabilities, and customer-oriented approaches, you’ll contribute to our transformative journey.

Responsibilities

Contribute to deployment phases with a focus on scalability, reliability, and operability of ELK and Kafka solutions. Ensure that production readiness is considered at every stage of the software lifecycle.
Develop automation scripts, infrastructure as code, and tooling using industry best practices to improve system reliability, reduce manual effort, and enable self-service.
Analyze production issues, identify root causes, and implement long-term reliability improvements through automation, alerting, monitoring, and architectural enhancements.
Work collaboratively with other team members and provide guidance to more junior team members.
Organize an efficient handover through high quality documentation and training.
Automate the deployment and operation of multi-tenant infrastructure, handling tasks that ensure system resilience and availability.
Develop and maintain monitoring tools, dashboards, and self-healing mechanisms.
Participate in on-call rotations and weekend deployment duty, conduct blameless postmortems, and drive continuous learning.
Work closely with developers, product teams, and engineering stakeholders to troubleshoot issues, improve systems, and integrate reliability improvements.
Collaborate with technical teams on operational concerns of integration solutions on the ELK platform.

Qualifications

Minimum 8 years of SRE/Software development experience, preferably in an international setting.
Familiarity with, or experience in, data ingestion using big data technologies (Elasticsearch, Logstash, Kibana, and Kafka).
Experience with CI/CD development and deployment tools such as Maven, Jenkins, Nexus, Git, and Docker.
Proficiency in Linux OS.
Proficiency in scripting and automation (e.g. Python, PowerShell, YAML) with the ability to develop tools and infrastructure as code (preferably with Ansible, Terraform, Kubernetes, OpenShift).
Understanding of distributed systems and microservices architectures, including REST and SOAP APIs.
Hands-on experience with ITIL processes, including Incident, Problem, and Continual Improvement, is an advantage.
Experience working within an Agile-driven environment.
Practical experience in building metrics for data-driven reporting.
Strong interpersonal skills with a customer-centric mindset and ability to work effectively across diverse cultures.
Proven ability to collaborate with both local and remote teams across different time zones.
Familiarity with or experience in managing VM hosts using vCenter is an advantage.

What we offer

We put you in control of your career.
We give you a competitive package.
We help you perform at your best.
We help you make a difference.
We give you the freedom to be yourself.
We are creating an environment of unique individuals with different perspectives on the financial industry and the world. A diverse and inclusive environment where everyone’s voice counts and you can reach your full potential.

If you believe you require a reasonable accommodation to participate in the job application or interview process, please contact us to request accommodation.
