
A leading technology firm in Kuala Lumpur is seeking a Site Reliability Engineer as part of its Fresh Graduate Program. The role involves managing cloud services, automating tasks to improve efficiency, and ensuring system reliability through engineering best practices. Candidates should hold a bachelor's degree in computer science or a related field and be familiar with technologies such as Docker and Kubernetes and with monitoring tools like Splunk or Grafana. This position is an excellent opportunity for fresh graduates aiming for a robust career in ICT development.
The Huawei Malaysia Fresh Graduate Program is ongoing. The program offers outstanding local talent fixed-term contracts that accelerate their careers while they take part in ICT development across multiple countries.
Open to fresh graduates who have graduated or will graduate in 2025 or 2026.
Take on the SRE role for assigned cloud services, owning the KPIs for service reliability, issue-to-resolution, service deployment, business continuity management, security policy planning, capacity planning, automation, etc.
Automation: Automate routine and manual operations tasks to reduce "toil" and improve efficiency.
Monitoring & Alerting: Implement and use monitoring systems to track system health, set up alerting, and create dashboards.
Incident Management: Respond to and manage incidents to minimize downtime and resolve issues quickly, including on-call support.
System Performance: Measure, analyze, and tune system performance to ensure efficiency and stability.
Infrastructure Management: Provision and manage cloud infrastructure, sometimes using Infrastructure as Code (IaC), and assist in platform management and capacity planning.
Reliability & Resilience: Build sustainable and reliable systems through software engineering practices, which can include resilience testing and chaos engineering.
Full-time bachelor’s degree or above (or equivalent) in computer science or related discipline.
Be familiar with containerization technologies like Docker and orchestration tools like Kubernetes.
Be familiar with configuration management and automation tools such as Ansible and Terraform.
Be familiar with monitoring, logging, and alerting tools like Splunk, Grafana, or Prometheus.
Have good communication, contingency-handling, organization, and coordination skills, along with strong analytical and troubleshooting skills for complex systems.