Role: Lead a team of Enterprise Observability Engineers to design, build and operate the Enterprise Observability Platform, ensuring that monitoring and key observability aspects of applications are covered.
Responsibilities:
Drive the implementation and management of the Enterprise Observability platform by enabling observability capabilities for IT operations.
Plan, design, build and manage the Enterprise Observability platform, based on the latest innovations, to cover services running in multi-cloud environments.
Collaborate with various stakeholders to develop technical solutions, plans and configurations for the observability capability, and develop runbook automation with the objective of achieving self-healing capability.
Integrate and develop full interoperability with various operations management platforms, including change management, service management and privileged access management systems.
Provide support to existing monitoring systems and bridge the transition to an AI-enabled observability platform.
Requirements:
Possess at least a Bachelor's Degree in Computer Science, Information Technology or equivalent.
Solid experience in managing large-scale infrastructure services on private and public clouds.
Minimum 5 years of hands-on experience in designing and building Enterprise Observability and AIOps platforms.
At least 5 years of hands-on working experience in:
Enterprise monitoring systems (such as AppDynamics, Nagios, ThousandEyes, SolarWinds, Broadcom, Prognosis).
Automation scripting (such as Ansible, Terraform, PowerShell).
Good understanding of cloud monitoring tools on AWS, Azure and GCP.
Demonstrate strong knowledge of both application and infrastructure domains, with the ability to develop automation scripts.
Possess good technical knowledge in implementing, troubleshooting, and performance tuning of hardware, operating systems, and system services.
Strong analytical skills, creativity, and out-of-the-box thinking in problem-solving.
Excellent command of written and spoken English.
A good team player, able to work effectively at all levels of an organization and to influence others to move towards consensus.
Proven ability to operate under pressure and meet challenging deadlines with minimal supervision.