Monitoring Administrator Senior at Blackfluo.ai
Location: Fully remote, Central European Time zone
Languages: English is mandatory
Job Description
The primary objective of this role is to design, implement, upgrade, and maintain a robust monitoring infrastructure using SCOM and Checkmk, with complementary capabilities provided by Elastic and Prometheus.
The successful candidate will work closely with our DevOps, IT, and development teams to ensure comprehensive monitoring, alerting, and visualization of our systems.
Duties and Responsibilities
- Assess current monitoring setup and identify gaps.
- Design, implement, and upgrade SCOM and Checkmk monitoring solutions in an on-premises setup designed for multiple tenants and multiple support teams.
- Configure and maintain dashboards for real-time visualization that account for multiple tenants and support teams.
- Develop and document monitoring strategies and best practices.
- Set up alerts and notification mechanisms to preemptively address system issues.
- Train internal staff on the use and maintenance of SCOM and Checkmk.
- Provide ongoing support and improvements to the monitoring framework.
- Ensure high availability and performance of the monitoring and logging systems.
- Provide standby services on a rotating basis during weekends, holidays, and outside normal working hours.
- Perform other duties as required.
Qualifications & Experience
- At least 5 years of experience in a similar role.
- Proven experience in deploying and managing SCOM and Checkmk in an on-premises setup designed for multiple tenants and multiple support teams.
- Strong understanding of monitoring concepts and best practices, including SNMP for network device monitoring.
- Experience with monitoring-related technologies: Windows Servers, Linux, SQL, IIS, Active Directory, web hosting, virtualization, on-premises infrastructure, and more.
- Proficiency in scripting and automation (e.g., PowerShell, Bash, Python).
- Familiarity with automation tools, especially Ansible.
- Knowledge of other monitoring tools like Elastic, Prometheus, and Grafana is highly desirable.
- Programming skills in .NET C# and Python are a plus.
Education & Certifications
- Bachelor's or master's degree in information technology is desirable.
- Monitoring certifications in SCOM, Checkmk, Elastic, Prometheus, or Grafana are desirable.
- Linux and/or Windows system administration certifications.