AVP, Site Reliability Engineering Lead (SRE), Middle Office Technology, Technology & Operations

Singapore

On-site

SGD 70,000 - 120,000

Full time

3 days ago

Job summary

A leading bank is seeking a DevSecOps Site Reliability Engineer to enhance their operational systems and applications. This role focuses on automating processes, optimizing infrastructure, and collaborating with diverse teams to ensure robust software deployments. The ideal candidate will possess strong DevOps skills and experience with modern cloud technologies, working in a dynamic environment that encourages innovation.

Qualifications

  • Experience deploying applications in containerized environments (Docker, Kubernetes).
  • Strong understanding of DevOps tools and practices, including CI/CD.
  • Hands-on experience with cloud technologies and data manipulation.

Responsibilities

  • Develop and debug automated tasks for systems and infrastructure.
  • Troubleshoot incidents and perform root cause analysis.
  • Collaborate with teams to optimize software release processes.

Skills

DevOps practices
Automation
Troubleshooting
Collaboration
Problem-solving

Tools

Docker
Kubernetes
AWS
Terraform
Jenkins
SQL
Hadoop
NoSQL databases

Job description

Business Function

Group Technology and Operations (T&O) enables and empowers the bank with an efficient, nimble, and resilient infrastructure through a strategic focus on productivity, quality & control, technology, people capability, and innovation. In Group T&O, we manage the majority of the Bank's operational processes and aim to delight our business partners through our multiple banking delivery channels.

The Regulatory Reporting team leads the overall front-to-back strategy and development of the regulatory reporting landscape and drives business changes for the Group Finance Platform (GFP).

Responsibilities:

As a DevSecOps Site Reliability Engineer (SRE), you'll help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operational problems. Your focus will be on optimizing existing systems, building infrastructure, and reducing manual work through automation. You'll join a team of curious problem solvers with diverse perspectives who think big and take risks. In this environment, you'll lead relevant projects within an organization that provides support and mentorship for your growth. Your main focus as an SRE will be on running production applications and systems more reliably.

  • Develop, test, and debug automated tasks (Apps, Systems, Infrastructure)
  • Troubleshoot priority incidents and facilitate blameless post-mortems
  • Work with development teams throughout the software lifecycle ensuring sustainable software releases
  • Perform analytics on previous incidents and usage patterns to better predict issues and take proactive actions
  • Build and promote adoption of greater self-healing and resiliency patterns
  • Lead and participate in performance tests; identify bottlenecks, opportunities for optimization, and capacity demands
  • Adhere to firm-wide architecture standards, risk management, and security policies
  • Collaborate with global teams, product owners, and business teams to develop, build, and support applications
  • Communicate and coordinate on development tasks with global teams, and resolve issues impacting development
  • Provide post-production application support
  • Participate in quality assurance, peer reviews, and code reviews

Qualifications:

  • Strong technical skills and the ability to deliver innovative solutions that benefit our global customers
  • Thought leader in end-to-end solutioning within the group
  • Up-to-date with the latest industry technologies
  • Active engagement with industry members, attending developer and tech meetups
  • Strong understanding of DevOps practices, tools, and techniques
  • Experience deploying applications in containerized environments using Docker, Kubernetes, PCF, OpenShift, AWS, etc.
  • In-depth OS experience (RHEL, Ubuntu, Windows Server) with troubleshooting and problem-solving skills
  • Experience in site reliability engineering using languages like Python, Java, PowerShell, Shell scripting, or Go
  • Hands-on experience with cloud technologies and with monitoring and observability tools such as Prometheus, Splunk, Elasticsearch, Grafana
  • Knowledge of modern development tools and practices including Agile, CI/CD, Git, Terraform, Jenkins
  • Understanding of networking protocols and cybersecurity best practices in cloud environments
  • Advanced SQL knowledge and experience with RDBMS, Hadoop, and NoSQL databases
  • Experience performing root cause analysis and process improvement
  • Ability to manipulate, process, and extract value from large datasets
  • Knowledge of message queuing, stream processing, and scalable big data stores
  • Experience working with cross-functional teams in dynamic environments