
Observability Administrator

Blackfluo.ai

Paris

Remote

EUR 50 000 - 70 000

Full-time

Today

Job summary

A technology firm is seeking an experienced Monitoring Engineer to design and maintain robust observability infrastructure using Elastic, Prometheus, and Grafana. The role involves configuring monitoring solutions, integrating systems, and training staff. Ideal candidates will have 5+ years in a similar role and proficiency in scripting. This position is fully remote, with working hours aligned to the Central European Time zone.

Qualifications

  • At least 5 years in a similar role.
  • Proven experience in deploying and managing Elastic, Prometheus, and Grafana.
  • Proficiency in scripting and automation (e.g., Bash, Python).

Responsibilities

  • Assess current monitoring and observability setup and identify gaps.
  • Design and upgrade Prometheus-based monitoring solutions.
  • Set up alerts and notification mechanisms for system issues.

Skills

Elastic
Prometheus
Grafana
Scripting
DevOps
Kubernetes
Docker

Education

Bachelor's or master's degree in information technology

Tools

SCOM
Checkmk
Infrastructure-as-code tools
Job description
Overview

The primary objective of this role is to design, implement, upgrade, and maintain a robust observability infrastructure using Elastic, Prometheus, and Grafana, with complementary capabilities provided by SCOM and Checkmk. The successful candidate will work closely with our DevOps, IT, and development teams to ensure comprehensive monitoring, alerting, and visualization of our systems, and should have advanced experience in complex enterprise environments. Because the Canonical Observability Stack (COS) will be used, advanced experience with COS would be ideal.

Location and Language
  • Location: Fully remote, Central European Time zone
  • Languages: English is mandatory
Duties and Responsibilities
  • Assess current monitoring and observability setup and identify gaps.
  • Design, implement, and upgrade Prometheus-based monitoring solutions in an on-premises setup designed for multiple tenants and several support teams.
  • Configure and maintain Grafana dashboards for real-time visualization, designed for multiple tenants and several support teams.
  • Integrate Prometheus with other systems and tools (e.g., Loki, Mimir, Tempo, Thanos).
  • Design, implement and upgrade Elastic (ELK Stack) for on-premises setups.
  • Develop and document monitoring and logging strategies and best practices.
  • Set up alerts and notification mechanisms to preemptively address system issues.
  • Train internal staff on the use and maintenance of Prometheus, Grafana, and Elastic.
  • Provide ongoing support and improvements to the observability framework.
  • Ensure high availability and performance of the monitoring and logging systems.
  • Provide standby services on a rotating basis during weekends, holidays, and outside normal working hours.
  • Perform other duties as required.
Required Qualifications & Experience
  • At least 5 years in a similar role
  • Proven experience in deploying and managing Elastic, Prometheus, and Grafana in an on-premises setup designed for multiple tenants and multiple support teams.
  • Strong understanding of observability concepts and best practices, including APM.
  • Experience with related technologies (e.g., Kubernetes, Docker, Kibana, Mimir, Loki, Tempo, Thanos, on-premises infrastructure).
  • Proficiency in scripting and automation (e.g., Bash, Python).
  • DevOps experience and practice.
  • Familiarity with infrastructure-as-code tools (e.g., Ansible, Terraform).
  • Experience with log management and tracing solutions (e.g., Loki, ELK stack, Jaeger).
  • Knowledge of other monitoring tools is desirable, especially SCOM and Checkmk.
  • Programming skills are desirable, especially C# (.NET) and Python.
Education and Certifications
  • Bachelor's or master's degree in information technology is desirable.
  • Monitoring certifications in SCOM, Checkmk, Elastic, Prometheus, or Grafana are desirable.
  • Linux and/or Windows system administration
  • Network administration