Site Reliability Engineer

TechTiera

Kuala Lumpur

On-site

MYR 60,000 - 100,000

Full time

Yesterday

Job summary

TechTiera is seeking a Site Reliability Engineer in Kuala Lumpur to work with developers on designing, building, and deploying secure infrastructure. The role involves monitoring system performance, automating operational tasks, and responding to incidents as part of an on-call rotation. Candidates should have a Bachelor's degree in Computer Science or equivalent practical experience, strong programming skills in a language such as Python, Java, or Go, and experience with cloud platforms and containers. The listed compensation is MYR 60,000 - 100,000.

Responsibilities

  • Collaborate with developers to design, build, and deploy highly available, scalable, and secure infrastructure.
  • Automate and streamline operational tasks to improve efficiency and reliability.
  • Continuously monitor system performance and proactively address issues to ensure optimal uptime.
  • Participate in on-call rotations and respond to incidents to quickly resolve problems.
  • Implement and maintain robust backup and disaster recovery strategies.
  • Document and share best practices, processes, and knowledge with the team.
  • Provide technical guidance and support to cross-functional teams.

Skills

  • Strong programming skills in Python, Java, Go, or similar languages
  • Experience with distributed systems
  • Hands-on experience with Linux/Unix systems
  • Familiarity with cloud platforms (AWS, Azure, or GCP)
  • Experience with containers and orchestration (Docker, Kubernetes)
  • Knowledge of CI/CD tools (GitHub Actions, Jenkins, GitLab CI)
  • Understanding of monitoring, alerting, and incident management
  • Strong debugging and performance-tuning skills
  • Background in security best practices (IAM, secrets management)

Education

Bachelor’s degree in Computer Science or equivalent practical experience

Requirements:

Bachelor’s degree in Computer Science or equivalent practical experience

Strong programming skills in Python, Java, Go, or similar languages

Experience with distributed systems and microservices

Hands-on experience with Linux/Unix systems

Familiarity with cloud platforms (AWS, Azure, or GCP)

Experience with containers and orchestration (Docker, Kubernetes)

Knowledge of CI/CD tools (GitHub Actions, Jenkins, GitLab CI, etc.)

Experience managing production systems

Understanding of monitoring, alerting, and incident management

Familiarity with SLO/SLI frameworks

Strong debugging and performance-tuning skills

Background in security best practices (IAM, secrets management)
