Site Reliability Engineer

Flash Group

Western Cape

On-site

ZAR 400 000 - 500 000

Full time

30+ days ago

Job summary

A leading technology company in South Africa is seeking a skilled individual for a permanent role focusing on Site Reliability Engineering (SRE) and DevOps. You will leverage your expertise in automation and scripting to manage systems effectively, lead incident responses, and shape architectural decisions. The ideal candidate should have 8-10 years of relevant experience and strong leadership skills, fostering team development in a cutting-edge environment.

Qualifications

8-10 years of relevant experience in SRE, DevOps, or system engineering.
Strong proficiency in scripting languages.
Relevant certifications.

Responsibilities

Master multiple scripting and programming languages for robust solutions.
Drive the design and implementation of automation tools.
Lead incident responses and perform post-incident reviews.
Shape system architecture and influence decisions.
Ensure adherence to reliability standards.

Skills

Scripting languages proficiency

Cloud skills & best practices

Automation

Incident and outage management

Capacity planning

Education

Relevant certification (e.g., Oracle, Cloud, DevOps)

Tools

Azure DevOps

Containers

Configuration management

Flash

2024/12/12 Western Cape

Job Reference Number: T169

Department: Technology

Business Unit:

Industry: Fintech

Job Type: Permanent

Positions Available: 3

Salary: Market Related

We are looking for an individual who is passionate about technology and has experience in developing and managing cutting-edge environment monitoring solutions, as well as in using software and automation to solve problems and manage production systems.

Job Description

RESPONSIBILITIES:

Master multiple scripting and programming languages to achieve advanced proficiency and deliver robust solutions.
Drive the design and implementation of sophisticated automation tools and processes for managing large-scale systems.
Lead critical incident responses with composure and efficiency, followed by thorough post-incident reviews to implement preventative measures.
Shape system architecture and design, bringing your vision and expertise to influence high-impact decisions.
Champion the creation and adherence to reliability standards, ensuring scalable and sustainable system operations.
Demonstrate strong strategic thinking and planning abilities to drive team and organizational success.
Exhibit exceptional leadership skills, with the capacity to influence key technical decisions and inspire cross-functional teams.
Possess mentorship and coaching expertise to nurture and develop junior and intermediate team members, fostering a collaborative and growth-oriented environment.

Job Requirements

MINIMUM REQUIREMENTS:

8-10 years of relevant experience in SRE, DevOps, or system engineering
Matric
Proficiency in scripting languages
Relevant certification such as Oracle, Cloud, or DevOps

TECHNICAL SKILLS:

Continuous delivery
Cloud skills & best practices
Observability (System and Application Performance Monitoring)
Infrastructure as code
Configuration management (Infrastructure as a Service)
Containers
Automation
Collaboration and Communication
Coding and Scripting
Azure DevOps
General system uptime
SLO (Service-level Objectives)
Latency
Incident and outage management
Change management
Capacity planning
