
Site Reliability Engineer

Hirexa Solutions

Köln

Remote

EUR 80,000 - 100,000

Full-time

Posted: today

Summary

A leading technology recruitment firm is seeking a Site Reliability Engineer to manage platform operations and enhance system stability. This remote role, based in Germany, requires strong skills in Kubernetes, Prometheus, and automation scripting. Ideal candidates will have a solid background in observability and be comfortable with 24x7 operational support. Applicants must reside in Germany and possess Ü2 security clearance.

Qualifications

  • Strong grasp of Linux concepts, preferably in Kubernetes environments.
  • Solid understanding of networking fundamentals and REST APIs.
  • Proficiency in Python, Go, or Bash.

Responsibilities

  • Manage Kubernetes and container orchestration including Helm chart configurations.
  • Maintain Prometheus solutions and administer Thanos and Grafana.
  • Configure and optimise Elasticsearch clusters.

Skills

Observability
Network
Kubernetes
Python
Git

Education

Elastic Certified Engineer
LPIC Level 2
Kubernetes Administrator

Tools

Elasticsearch
Prometheus
Grafana
CI/CD (Jenkins, ArgoCD)
Job description
Overview

Job Title: Site Reliability Engineer

Location: Germany (Remote)

Employment Type: FTE / FTC

About Hirexa Solutions:

Hirexa Solutions is a leading player in the recruitment ecosystem across the United States, United Kingdom, Europe, and India. As the fastest-growing next-generation provider of technology talent, we empower our clients to become resourceful, achieve higher productivity, adopt agile structures, and effectively execute project deliverables.

Envisioned and co-founded by veterans of the Information Technology industry, our mission is to make recruitment efficient, flawless, and cost-effective. Our unwavering commitment to strategic investments in intelligent technology underscores our passion for people and our dedication to helping organizations realize their true potential.

About the Role:

We are seeking a Site Reliability Engineer (SRE) with a strong background in observability, secure logging, and automation. The ideal candidate will have hands-on experience with Elasticsearch and/or Prometheus platforms. The role carries critical platform operations responsibilities, including incident management, scheduled maintenance, and engineering tasks that improve system stability. The SRE will also follow standard operating procedures (SOPs) and contribute to their continuous improvement through constructive feedback.

Mandate conditions
  • Skills required: Observability (network, open observability, SNMP protocol, SSH, Prometheus), Visualisation (Grafana), CI/CD (GitHub), cluster management, private cloud, Kubernetes clusters, alert management, Operations (Logstash, troubleshooting, repositories), curl command, DNS, IP address ranges, TCP connections, Linux
  • Must live in Germany or be open to relocating, and must be in Germany at the time of joining
  • Ü2 security clearance: candidates should be comfortable undergoing this process
  • 24x7 operational support
Key Responsibilities
  • Platform Engineering & DevOps: Manage Kubernetes and container orchestration, including Helm chart configurations and CI/CD pipelines (Jenkins, ArgoCD). Develop automation scripts (Python, Bash, Go) and deploy Infrastructure-as-Code (IaC) solutions.
  • Observability, Monitoring & Visualisation: Maintain Prometheus solutions (scrape configurations, alert rules, PromQL queries) and administer Thanos and Grafana.
  • Elastic Stack Operations & Log Management: Configure and optimise Elasticsearch clusters, Logstash pipelines, and Kibana dashboards for secure, scalable log processing.
  • Incident Response, Troubleshooting & Collaboration: Participate in 24x7 on-call rotations for rapid incident response, troubleshoot platform, data, and performance issues, and engage in Major Incident Management (MIM).
  • Secure Operations & Compliance: Ensure system operations meet security and data protection requirements, maintain secure documentation, and manage access control policies.
Qualifications, Requirements, and Skills
  • Strong grasp of Linux concepts, preferably in Kubernetes environments.
  • Solid understanding of networking fundamentals and REST APIs.
  • Proficiency in Python, Go, or Bash.
  • Proficiency in Git-based configuration management workflows.
  • Familiarity with CI/CD tools like Helm, Jenkins, or ArgoCD.
  • Experience with Elasticsearch and/or OpenSearch.
  • Fluent English communication skills.
  • Willingness to work shift-based 24x7 on-call support, including weekends and holidays.
  • Must possess Ü2 security clearance.
  • Citizenship required: a member state of the EU and NATO. No dual citizenship outside these countries.
  • Must reside in Germany and hold a German labor contract.
  • Preferred certifications: Elastic Certified Engineer, LPIC Level 2, Kubernetes Administrator.
How to Apply

If you are interested in this opportunity, please submit your resume. We look forward to hearing from you!
