An innovative online travel agency is seeking a Site Reliability Engineer to enhance their SRE practices and ensure system reliability. In this pivotal role, you will promote best practices, improve performance testing, and develop tools to support a high-load environment. The company values observability and scalability, leveraging cutting-edge cloud technologies. With a commitment to employee development and well-being, this position offers a supportive environment for personal and professional growth. Join a forward-thinking team dedicated to transforming the travel industry through technology.
We are a rapidly growing online travel agency with technology at the core of our success. In 2022, we helped millions of people take their dream holidays.
Our platform handles a million visitors a day and supports over 100 services, processing 8,000 requests per second with a p95 search latency of 150 ms. Our observability infrastructure captures and processes 1 TB of logs per day and 350,000 metric samples per second.
We set ourselves apart through open source: we open-source internal tools, contribute to public repositories, and sponsor conferences.
As our first Site Reliability Engineer, you will help evolve SRE practices such as incident management, blameless postmortems, SLOs, and error budgets. Your role will involve building reliable, performant, auto-scalable, and highly available systems with support from the existing Platform Infrastructure team.
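For readers less familiar with error budgets, here is a minimal sketch of the arithmetic behind one. The 99.9% SLO target and the 30-day window are assumptions chosen for illustration and are not figures from this posting; only the 8,000 requests per second rate is taken from the text above. The function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def error_budget(slo_target: float, total_requests: int) -> int:
    """Requests that may fail in the window while still meeting the SLO."""
    return round((1 - slo_target) * total_requests)


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the error budget still unspent (negative once the SLO is breached)."""
    return 1 - failed_requests / error_budget(slo_target, total_requests)


# Assumed for illustration: a 99.9% availability SLO over a 30-day window.
SLO_TARGET = 0.999
WINDOW_DAYS = 30

# 8,000 requests per second is the rate quoted in the posting.
total = 8_000 * SECONDS_PER_DAY * WINDOW_DAYS

budget = error_budget(SLO_TARGET, total)
print(f"Requests in window: {total:,}")
print(f"Error budget:       {budget:,} failed requests")
print(f"Budget remaining after 5,184,000 failures: "
      f"{budget_remaining(SLO_TARGET, total, 5_184_000):.0%}")
```

Under these assumptions the budget is about 20.7 million failed requests per window. A team that has spent a quarter of it still has 75% left to spend on releases and experiments.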
Our engineering teams manage the entire lifecycle of services from initial development to high-load production operation. Your responsibility is to enable engineering teams to succeed in operations, not to run their services for them.
We focus heavily on observability and continuously evolve our monitoring and alerting stack, which is built on Mimir and the wider Grafana ecosystem (Prometheus, Grafana, Loki, Tempo). Our service mesh (Linkerd) provides uniform observability of all production services at 10-second intervals.
Performance and scalability are fundamental to our development process, achieved by combining core computer science principles with cutting-edge cloud technologies.
loveholidays offers a personalized approach to searching for your next getaway, allowing you to customize your holiday with maximum flexibility. Rest assured, your holiday is ATOL protected. We offer various payment options to ensure a seamless booking experience.