Senior Monitoring Administrator
Job Description:
- Location: Fully remote, Central Europe Time Zone
- Languages: English is mandatory
The primary objective of this role is to design, implement, upgrade, and maintain a robust monitoring infrastructure built on SCOM and Checkmk, with complementary capabilities provided by Elastic and Prometheus. The successful candidate will work closely with our DevOps, IT, and development teams to ensure comprehensive monitoring, alerting, and visualization of our systems.
Duties and Responsibilities:
- Assess current monitoring setup and identify gaps.
- Design, implement, and upgrade SCOM and Checkmk monitoring solutions in an on-premises setup designed for multiple tenants and multiple support teams.
- Configure and maintain dashboards for real-time visualization, scoped per tenant and per support team.
- Develop and document monitoring strategies and best practices.
- Set up alerts and notification mechanisms so that system issues can be addressed before they affect users.
- Train internal staff on the use and maintenance of SCOM and Checkmk.
- Provide ongoing support and improvements to the monitoring framework.
- Ensure high availability and performance of the monitoring and logging systems.
- Provide standby services on a rotation basis during weekends, holidays, and outside of normal working hours.
- Perform other duties as required.
Required Qualifications & Experience:
- At least 5 years in a similar role.
- Proven experience in deploying and managing SCOM and Checkmk in an on-premises setup designed for multiple tenants and multiple support teams.
- Strong understanding of monitoring concepts and best practices, including SNMP for network device monitoring.
- Experience monitoring a range of technologies, including Windows Server, Linux, SQL, IIS, Active Directory, websites, virtualization, and on-premises infrastructure.
- Proficiency in scripting and automation (e.g., PowerShell, Bash, Python).
- Familiarity with automation tools, especially Ansible.
- Knowledge of other monitoring tools is highly desirable, especially Elastic, Prometheus, and Grafana.
- Programming skills are desirable, especially .NET C# and Python.
Education and Certifications:
- A bachelor's or master's degree in information technology is desirable.
- Monitoring certifications in SCOM, Checkmk, Elastic, Prometheus, or Grafana are desirable.
- Linux and/or Windows system administration certifications are desirable.