Monitoring Administrator Senior

Blackfluo.ai

Cape Town

On-site

ZAR 600,000 - 900,000

Full time

28 days ago

Job summary

A leading company in the technology sector is seeking a Senior Monitoring Administrator to design and maintain a robust monitoring infrastructure. The successful candidate will work closely with DevOps and IT teams to ensure effective monitoring and alerting across systems. This fully remote position requires strong technical skills in SCOM, Checkmk, and related technologies, along with a minimum of 5 years' experience in a similar role.

Qualifications

  • At least 5 years in a similar role.
  • Proven experience with SCOM and Checkmk in multi-tenant setups.
  • Strong understanding of monitoring best practices.

Responsibilities

  • Design and implement monitoring solutions using SCOM and Checkmk.
  • Configure dashboards for real-time visualization.
  • Train staff on monitoring tools and strategies.

Skills

Monitoring concepts
Scripting
Automation
Problem-solving

Education

Bachelor's or master's degree in information technology
Monitoring certifications in SCOM, Checkmk, Elastic, Prometheus, Grafana
Linux and/or Windows System Administration certifications

Tools

SCOM
Checkmk
Elastic
Prometheus
Grafana
Ansible

Job description

Location: Fully remote, Central European Time zone

Languages: English is mandatory

The primary objective of this role is to design, implement, upgrade, and maintain a robust monitoring infrastructure using SCOM and Checkmk, with complementary capabilities provided by Elastic and Prometheus.

The successful candidate will work closely with our DevOps, IT, and development teams to ensure comprehensive monitoring, alerting, and visualization of our systems.

Duties and Responsibilities

  1. Assess current monitoring setup and identify gaps.
  2. Design, implement, and upgrade SCOM and Checkmk monitoring solutions in an on-premises setup with a multi-tenant, multi-support-team design.
  3. Configure and maintain dashboards for real-time visualization with multi-tenant and support team considerations.
  4. Develop and document monitoring strategies and best practices.
  5. Set up alerts and notification mechanisms to preemptively address system issues.
  6. Train internal staff on the use and maintenance of SCOM and Checkmk.
  7. Provide ongoing support and improvements to the monitoring framework.
  8. Ensure high availability and performance of the monitoring and logging systems.
  9. Provide standby services on a rotation basis during weekends, holidays, and outside normal working hours.
  10. Perform other duties as required.

Qualifications & Experience

  1. At least 5 years in a similar role.
  2. Proven experience in deploying and managing SCOM and Checkmk in an on-premises setup with a multi-tenant, multi-support-team design.
  3. Strong understanding of monitoring concepts and best practices, including SNMP for network device monitoring.
  4. Experience with the technologies being monitored, including Windows Server, Linux, SQL, IIS, Active Directory, web hosting, virtualization, and on-premises infrastructure.
  5. Proficiency in scripting and automation (e.g., PowerShell, Bash, Python).
  6. Familiarity with automation tools, especially Ansible.
  7. Knowledge of other monitoring tools like Elastic, Prometheus, and Grafana is highly desirable.
  8. Programming skills in C# (.NET) and Python are a plus.

Education & Certifications

  1. Bachelor's or master's degree in information technology is desirable.
  2. Monitoring certifications in SCOM, Checkmk, Elastic, Prometheus, or Grafana are desirable.
  3. Linux and/or Windows System Administration certifications.