Monitoring Administrator Senior at Blackfluo.ai
- Location: Fully remote, Central European Time zone
- Languages: English is mandatory
Job Description
The primary objective of this role is to design, implement, upgrade, and maintain a robust monitoring infrastructure using SCOM and Checkmk, with complementary capabilities provided by Elastic and Prometheus. The successful candidate will work closely with our DevOps, IT, and development teams to ensure comprehensive monitoring, alerting, and visualization of our systems.
Duties and Responsibilities
- Assess current monitoring setup and identify gaps.
- Design, implement, and upgrade SCOM and Checkmk monitoring solutions in an on-premises setup designed for multiple tenants and multiple support teams.
- Configure and maintain dashboards for real-time visualization, designed for multiple tenants and multiple support teams.
- Develop and document monitoring strategies and best practices.
- Set up alerts and notification mechanisms to preemptively address system issues.
- Train internal staff on the use and maintenance of SCOM and Checkmk.
- Provide ongoing support and improvements to the monitoring framework.
- Ensure high availability and performance of the monitoring and logging systems.
- Provide standby services on a rotation basis during weekends, holidays, and outside of normal working hours.
- Perform other duties as required.
Required Qualifications & Experience
- At least 5 years in a similar role.
- Proven experience in deploying and managing SCOM and Checkmk in an on-premises setup designed for multiple tenants and multiple support teams.
- Strong understanding of monitoring concepts and best practices, including SNMP for network device monitoring.
- Experience with monitoring-related technologies such as Windows Server, Linux, SQL, IIS, Active Directory, websites, virtualization, and on-premises infrastructure.
- Proficiency in scripting and automation (e.g., PowerShell, Bash, Python).
- Familiarity with automation tools, especially Ansible.
- Knowledge of other monitoring tools, especially Elastic, Prometheus, and Grafana, is highly desirable.
- Programming skills in C# (.NET) and Python are desirable.
Education and Certifications
- Bachelor's or master's degree in information technology is desirable.
- Monitoring certifications in SCOM, Checkmk, Elastic, Prometheus, or Grafana are desirable.
- Linux and/or Windows system administration certifications are desirable.
Additional Information
- Seniority level: Not Applicable
- Employment type: Full-time
- Job function: Information Technology
- Industry: Software Development