Technology / Domain Specialist II (Site Reliability Engineer)

Nedbank Private Wealth

Johannesburg

On-site

ZAR 700,000 - 1,000,000

Full time

30+ days ago

Job summary

A reputable financial institution is seeking a Technology Domain Specialist (Site Reliability Engineer) to lead initiatives in enhancing service reliability and efficiency. In this full-time role, you will work with various teams to implement SRE practices, engage in incident management, and coach junior engineers. The candidate should have a strong background in IT, particularly with Kubernetes and DevOps frameworks, and possess solid leadership skills. This position is based in Johannesburg and offers a competitive salary aligned with experience.

Qualifications

  • Minimum of 5 years IT experience, with at least 3 years in a relevant technology or domain.
  • Advanced Diplomas / National 1st Degrees required.

Responsibilities

  • Collaborate with stakeholders to ensure reliability service offerings meet customer needs.
  • Coach squads on SRE practices and implement automated solutions for high availability.

Skills

Kubernetes
Root Cause Analysis
Continuous Improvement
Elasticsearch
Proactivity

Education

Degree or Diploma in IT

Job description

Job Classification

140754 - Technology Domain Specialist (Site Reliability Engineer)

Closing date - 10 July 2025

Job Family

Information Technology

Career Stream

Application Development

Leadership Pipeline

Manage Self : Technical

Job Purpose

To actively own and participate in the overall evolution of the Technology or Domain asset while influencing and maintaining the health of the asset. Play a leadership role in the associated COEs.

Job Responsibilities

  • Collaborate with stakeholders, engineers and operational SMEs to ensure all relevant parties are up to date with what is top of mind within the reliability service offerings.
  • Evolve services based on customer needs and technology to ensure we remain competitive in the market.
  • Influence and collaborate with squads during service or platform design to proactively prevent system failures and enhance performance.
  • Engage with Asset / Journey squads to adopt SRE practices, with a core focus on contributing to incident management and advocating for blameless post-mortems.
  • Engage and influence squads with regard to observability and high availability, utilising new or existing technology, and improve disaster recovery plans.
  • Implement automation-based solutions to achieve high availability and efficiency, reduce cost and improve system performance.
  • Coach squads on best practices within the organisation via internal forums to position SRE fundamental knowledge and promote enterprise-wide knowledge sharing.
  • Assist with creating and maintaining system health and performance metrics that reflect real-time data, enabling proactive resolution and faster troubleshooting.
  • Collaborate and partner with the DevOps engineer / coach to ensure efficient CI / CD pipelines, and resolve or improve on any failures.
  • Take charge of technical leadership, engage with squads to identify the best solutions, and support and guide junior SREs.
  • Assist in defining and implementing metrics related to the performance of services, such as SLIs and SLOs.
  • Define and deliver Site Reliability Engineering technical standards in partnership with all disciplines of software engineering.
  • Participate in and work closely with relevant COEs to improve the release of new features and time to market.
  • Build and maintain strategic relationships with the business units and vendors in order to stay in sync on current ways of work and the business decisions being embraced.
  • Conduct assessments within squads to measure SRE maturity, provide a report, and outline a plan to help them move to the next level, with continuous feedback.
  • Adhere to and comply with Nedbank Group information management, data integrity and security policies and best practices.
  • Participate in and support corporate responsibility initiatives for the achievement of business strategy.
  • Manage multiple concurrent objectives, projects, groups or activities, making effective judgements as to prioritisation and time allocation.

Technical Skills

  • Working experience with an operating system (Linux or Windows)
  • Knowledge of microservices and containerisation (K8s or Docker)
  • Troubleshooting and root cause analysis
  • SRE best practices
  • In-depth knowledge of DevOps framework
  • Experience with and knowledge of programming languages (C#, Java, Python, Bash)
  • Proactivity in seeking improvement opportunities
  • Experience with troubleshooting production systems / applications

Essential Qualifications - NQF Level

  • Advanced Diplomas / National 1st Degrees
  • Professional Qualifications / Honours Degree

Preferred Qualification

  • Degree or Diploma in IT

Preferred Certifications

  • Certificate in relevant Technology or Domain

Minimum Experience Level

  • Minimum 5 years IT experience, with 3 years in a relevant technology or domain

Technical / Professional Knowledge

  • Asset management
  • IT asset management processes
  • Data Warehousing
  • Information Technology (IT) Architecture

Behavioural Competencies

  • Decision Making
  • Courage
  • Stress Tolerance
  • Quality Orientation
  • Technical / Professional Knowledge and Skills
  • Emotional Intelligence Essentials
  • Resolving Conflict

Please contact the Nedbank Recruiting Team at

Required Experience: Unclear Seniority

Key Skills

Kubernetes, FMEA, Continuous Improvement, Elasticsearch, Go, Root Cause Analysis, Maximo, CMMS, Maintenance, Mechanical Engineering, Manufacturing, Troubleshooting

Employment Type: Full Time

Vacancy: 1
