(AP686) - Lead SRE Engineer

Stuart

Ibiza

Remote

EUR 60,000 - 80,000

Full-time

Posted 3 days ago

Job description

A leading logistics company in Ibiza is seeking a Lead Site Reliability Engineer to spearhead technical initiatives, enhance platform reliability, and lead a newly formed SRE team. The ideal candidate will have over 5 years of experience in mission-critical services, a strong background in automation, and a passion for fostering team culture. This role offers flexibility with remote work in Spain and opportunities for personal development.

Benefits

Family-friendly work-life balance

Flexibility with remote work

Ticket Restaurant benefit (€11 daily)

Unlimited Udemy access

Qualifications

5+ years of experience in a similar role.
Passion for automation to eliminate repetitive tasks.
Proven experience leading complex projects.

Responsibilities

Leading the team as the go-to expert on software reliability.
Participating in hiring and fostering team culture.
Designing and implementing Stuart’s observability stack.

Skills

Experience in mission-critical services

Background in Systems or Software Engineering

Expertise in troubleshooting Linux and networking issues

Strong knowledge of AWS, EKS, Kubernetes

Experience with chaos engineering practices

Fluency in English

Tools

Terraform

Stuart (DPD Group) is a sustainable last-mile logistics company connecting retailers and e-merchants to a fleet of geolocalised couriers across several countries in Europe.

Our Mission

We are an impact-driven company aiming to build a more sustainable future for logistics: shared, efficient, and reliable. We strive to set new standards for urban deliveries that address environmental and social challenges while providing a premium delivery experience that is fast, flexible, and convenient.

Our motto: “Make every delivery a moment all of us can truly celebrate!” Over 3,000 leading brands across Restaurants, Grocery, Retail & Luxury, eCommerce, and Professional Services partner with us to deliver goods seamlessly. Stuart is a diverse and inclusive company with 700+ employees from 90+ nationalities working across France, Italy, Poland, Portugal, Spain, and the U.K.

With the surge in home delivery services, now is the perfect time for us to make a significant impact. You can help us realize this vision.

We are looking for a Lead Site Reliability Engineer to be a technical leader for our SRE team, guiding the team technically and enhancing our platform’s robustness, failure handling, and early issue detection through automation, proper alarming, and chaos engineering.

The SRE mission is to maximize platform reliability by reducing incidents and their severity. This involves monitoring services effectively, setting meaningful alarm thresholds, and automating remediation tasks.

Reliability is further strengthened by introducing controlled errors (chaos engineering) and testing disaster recovery scenarios. SREs serve as stewards of reliability, providing the necessary technical and documentation tools for other engineering teams.

The SRE team is newly formed at Stuart, offering you the opportunity to influence its growth. You will be part of the Infrastructure department’s Reliability area, alongside the Engineering Support team, Cloud Engineering, Security, and IT.

What will I be doing?

Leading the team as the go-to expert on software reliability.

Participating in hiring, community talks, defining team processes, and fostering team culture and growth.

Helping engineering teams build reliable, observable, and high-performance products.

Driving and assisting other teams in setting and tracking SLOs and SLAs via SLIs.

Designing, implementing, and guiding adoption of Stuart’s observability stack.

Contributing to system reliability and performance improvements.

Writing and automating playbooks for alarms to minimize manual intervention.

Documenting best practices and knowledge sharing.

Collaborating on incident management with the Engineering Support team.

Leading post-mortem analyses and follow-up actions.

Advancing chaos engineering initiatives.

What do we need from you?

5+ years of experience in a similar role within mission-critical, always-up services.

Background in Systems or Software Engineering.

Passion for automation to eliminate repetitive tasks.

Proven experience leading complex projects.

Expertise in troubleshooting Linux and networking issues.

Experience with complex Terraform codebases; bonus if you have written a provider.

Strong knowledge of AWS, EKS, Kubernetes, and cloud environments.

Experience with chaos engineering practices.

Enjoyment in teaching, documenting, and sharing best practices.

Proactive attitude to identify and resolve issues.

Fluency in English, both written and spoken.

We understand you may not meet every criterion; we share this list to give you an idea of our ideal candidate profile.

The stuff you wanna know

Family-friendly work-life balance with remote work and flexible hours.

Option to work remotely anywhere in Spain.

Ticket Restaurant benefit (€11 daily).

Unlimited Udemy access for learning and development.

Stuart Academy with regular workshops and classes.

The original listing can be found on Kit Empleo.
