Responsibilities
- Participate in platform software engineering, writing code to reduce human intervention and automate operational tasks.
- Lead in‑depth technical and data analysis to monitor service trends and drive continuous improvement.
- Prioritize reliability features and design, develop, and deliver effective tooling, alerts, and automated responses to mitigate reliability risks.
- Report SLO‑based reliability, stability, and efficiency results, service health dashboards, key reliability risks, and incidents to senior stakeholders, helping them prioritize activity and direct investment and action.
- Automate installation and maintenance of test/development servers, release builds, and deployment of existing tools and dependent solutions.
- Design and take ownership of innovations that improve software engineering velocity, infrastructure resiliency, and security, evaluating new application packages and tools and researching best practices.
- Debug infrastructure problems and diagnose their root causes to keep operations running; review, verify, and validate software code developed in DevOps projects.
Qualifications
- 6+ years of experience in software development and DevSecOps/SRE functions, with at least two years in a senior technical capacity.
- Experienced Systems Engineer, SysAdmin, or Software Engineer with professional Linux skills, distributed systems development experience, and a passion for automating API‑driven tasks at scale.
- Strong programming skills in Java, C/C++, Node.js, Python, or Go, with experience building and deploying software products and overcoming challenges in distributed systems.
- Ability to communicate incident status clearly via email in a business‑friendly tone.
- Advanced understanding of observability tools (ELK, Grafana/Prometheus, Zabbix, Nagios, etc.) and of CI/CD and release management solutions.
- Broad knowledge of OS platforms (Linux/UNIX), networking, web systems, and DevSecOps; experience with large‑scale distributed systems and microservices architecture concepts.
- Experience with containers and deployment automation tools (Docker, Pulumi, Ansible, Puppet, etc.) and with integration/build tools (Jenkins, Groovy, Maven, Atlassian Suite, GitLab CI).
Benefits
- Attractive remuneration package.
- Permanent role with opportunities for career progression and skill development.
- Work in a large, innovative organization recognized for its technology leadership.
About the Company
Our organization specializes in lawful intelligence technology, providing end‑to‑end monitoring and intelligence solutions to governments worldwide. This is an opportunity to work with a well‑established, large organization and be part of a collaborative, technology‑driven team in Kuala Lumpur.