Site Reliability Engineer (f/m/d)

arsys ES

Logroño, Madrid, Barcelona

On-site

EUR 70,000 - 90,000

Full-time

Posted today

Job description

A leading technology company is seeking a Site Reliability Engineer (SRE) for its applications team in Logroño, Spain. The role involves contributing to the evolution of product infrastructure, ensuring stable operations, performing in-depth analysis, and driving automation using tools like Terraform and GitLab CI/CD. Candidates should have several years of experience in similar roles, advanced expertise in Linux and Kubernetes, and proficiency in programming languages for automation tasks. Good command of English is required.

Benefits

Up to one day per week for learning and training.

Qualifications

Several years of experience as an SRE or in similar roles.
Proficient in Linux and container technologies.
Good command of English, both spoken and written.

Responsibilities

Contribute to the evolution of product infrastructure.
Perform in-depth analysis and optimization of environments.
Develop and maintain monitoring solutions.

Skills

Linux expertise

Kubernetes

Terraform

CI/CD pipelines

Automation scripting (Go, Python, Bash)

Monitoring/logging tools (Prometheus, Grafana)

Tools

GitLab CI/CD

Helm charts

ELK Stack

We are looking for a Site Reliability Engineer (SRE) in the IONOS Applications team.

Tasks

Contribute to the evolution of product infrastructure, integrating new services and applications into our cloud and Kubernetes environment.
Ensure the stable and secure operation of our platforms.
Perform in-depth analysis and optimization of distributed and highly scalable environments.
Drive automation using tools such as Terraform, GitLab CI/CD, and ArgoCD, managing infrastructure declaratively and reproducibly.
Analyze and resolve complex issues in distributed systems, contributing to the continuous improvement of the platform.
Develop and maintain monitoring, logging, and alerting solutions (e.g., Prometheus, Grafana, ELK Stack) to proactively detect bottlenecks and sources of error.
Participate in on-call rotations, one week every 4 to 5 weeks.
Collaborate with product development teams to organize joint projects.
Manage incidents end-to-end: initial analysis, ticket creation, resolution, and follow-up through Problem Management.

Benefits

Up to one day per week for learning and continuous training.

Requirements

Several years of experience as an SRE or in similar roles (Linux System Administrator, DevOps Engineer, Platform Engineer, Full Stack Developer).
Advanced expertise in Linux, container technologies, and especially Kubernetes.
Experience with Infrastructure as Code (preferably Terraform), CI/CD pipelines (GitLab CI/CD, GitHub Actions), and Helm charts.
Proficiency in at least one programming or scripting language (Go, Python, Bash) for automation and monitoring tasks.
Experience in operating and troubleshooting high-availability production environments.
Knowledge of monitoring, alerting, and log analysis for distributed applications (Prometheus, Grafana, FluentD, ELK, VictoriaMetrics, Icinga).
A proactive, solution-oriented, and independent working style, with the ability to systematically analyze and sustainably resolve technical problems.
Good command of English (spoken and written).
