AVP, Site Reliability Engineering Lead (SRE), Middle Office Technology, Technology & Operations

Out in Science, Technology, Engineering, and Mathematics

Singapore

On-site

SGD 70,000 - 120,000

Full time

Yesterday

Job summary

A leading technology team seeks a DevSecOps Site Reliability Engineer to optimize existing systems and infrastructure through automation. You'll join a motivated team focused on solving operational problems and collaborating on innovative engineering solutions. The role involves running production applications, ensuring resilience and quality, and leading technical projects that support business goals.

Qualifications

  • Experience deploying applications in containerized environments (Docker, Kubernetes, etc.).
  • Strong understanding of DevOps practices, tools, and techniques.
  • Ability to perform root cause analysis and process improvement.

Responsibilities

  • Develop, test, and debug automated tasks (Apps, Systems, Infrastructure).
  • Collaborate with development teams to ensure sustainable software releases.
  • Provide post-production application support.

Skills

DevOps practices
Troubleshooting
SQL
Cloud technologies
Containerization
Agile development

Tools

Docker
Kubernetes
AWS
Terraform
Jenkins
Prometheus
Splunk
Elasticsearch
Grafana

Job description

Business Function

Group Technology and Operations (T&O) enables and empowers the bank with an efficient, nimble, and resilient infrastructure through a strategic focus on productivity, quality & control, technology, people capability, and innovation. In Group T&O, we manage the majority of the Bank's operational processes and aim to delight our business partners through our multiple banking delivery channels.

The Regulatory Reporting team leads the overall front-to-back strategy and development of the regulatory reporting landscape and drives business changes for the Group Finance Platform (GFP).

Responsibilities:

As a DevSecOps Site Reliability Engineer (SRE), you'll help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems. Your focus will include supporting and developing software to optimize existing systems, building infrastructure, and automating work to reduce manual effort. You'll join a team of curious problem solvers with diverse perspectives who think big and take risks. In this environment, you'll lead relevant projects with support from an organization that fosters learning and growth. Your core focus as an SRE will be on running production applications and systems better.

  • Develop, test, and debug automated tasks (Apps, Systems, Infrastructure)
  • Troubleshoot priority incidents and facilitate blameless post-mortems
  • Collaborate with development teams throughout the software lifecycle to ensure sustainable software releases
  • Analyze previous incidents and usage patterns to predict issues proactively
  • Build and promote adoption of self-healing and resiliency patterns
  • Lead and participate in performance testing; identify bottlenecks and optimization opportunities
  • Adhere to architecture standards, risk management, and security policies
  • Work effectively in a global team environment with product owners and business teams to develop, build, and support applications
  • Communicate and collaborate on development tasks with the global team and resolve issues impacting development
  • Provide post-production application support
  • Participate in quality assurance, peer reviews, and code reviews

Requirements:

  • Strong technical skills and the ability to deliver innovative solutions aligned with customer interests globally
  • Thought leadership in end-to-end solutioning
  • Up-to-date knowledge of industry technologies
  • Active participation in developer and tech meetups
  • Strong understanding of DevOps practices, tools, and techniques
  • Experience deploying applications in containerized environments (Docker, Kubernetes, PCF, OpenShift, AWS)
  • In-depth OS experience (RHEL, Ubuntu, Windows Server) with troubleshooting skills
  • Experience in site reliability engineering using languages like Python, Java, PowerShell, Shell scripting, or Go
  • Hands-on experience with cloud technologies and with monitoring and logging tools (Prometheus, Splunk, Elasticsearch, Grafana)
  • Knowledge of modern development tools and practices (Agile, CI/CD, Git, Terraform, Jenkins)
  • Understanding of networking protocols and cybersecurity in cloud environments
  • Advanced SQL skills and experience with RDBMS, Hadoop, and NoSQL databases
  • Ability to perform root cause analysis and process improvement
  • Experience working with large datasets and message queuing/stream processing
  • Proven ability to support cross-functional teams in dynamic settings