Senior Site Reliability Engineer - Databases (Remote, Germany)

Grafana Labs

München

Remote

EUR 91,000 - 115,000

Full-time

Posted yesterday

Summary

A leading company in cloud solutions is seeking a Senior Site Reliability Engineer to enhance the reliability of their cloud databases. This remote role offers the opportunity to work with high-value customers, managing configurations, developing new features, and ensuring optimal performance across platforms like AWS, GCP, and Azure. Ideal candidates will have extensive SRE experience and strong skills in Kubernetes and cloud infrastructure.

Benefits

Equity
Bonuses

Qualifications

  • At least 6 years of engineering experience, with 3+ years in SRE roles.
  • Experience with various cloud services and infrastructure-focused programming.
  • Ability to work autonomously and communicate effectively.

Responsibilities

  • Enhance reliability of cloud databases and manage software configurations.
  • Conduct regular reviews and create SLOs while improving monitoring and automation.
  • Collaborate on product strategy and engage in incident response.

Skills

Kubernetes
Cloud Storage
Troubleshooting
Incident Response
Distributed Computing
Communication

Tools

Helm
AWS
GCP
Azure

Job description

Senior Site Reliability Engineer - Databases

This is a remote position, considering candidates in Spain, Sweden, the UK, and Germany.

About the role:

We are seeking a Senior SRE to support our high-value Grafana Cloud customers by enhancing the reliability of our cloud databases based on Mimir, Loki, Tempo, and Pyroscope. These SaaS databases are hosted on AWS, GCP, and Azure across various regions.

The SRE team is a new addition within the Databases department, responsible for the environments of our largest customers and acting as an overlay to existing database teams. As part of this team, you will manage software configurations, contribute to new feature development, oversee releases, and ensure they meet SLOs without degrading user experience. Your role involves designing, reviewing, and improving the reliability, observability, and customer use of our systems.

This role involves an on-call component, shared across the team, to ensure a healthy on-call experience aligned with daylight hours. We hire globally to support this model.

What we seek:
  • At least 6 years of engineering experience, with 3+ years in SRE roles
  • Experience as a reliability/production engineer, infrastructure/systems engineer, or software engineer with an infrastructure focus
  • Strong communication skills for technical discussions and cross-team collaboration
  • Experience with Kubernetes on AWS, GCP, or Azure, and with Helm charts or other IaC tools
  • Knowledge of Site Reliability Engineering, distributed computing, and related areas
  • Proficiency in programming languages such as Go, Python, Java, etc.
  • Understanding of Linux internals, networking, cloud storage, and scaling
  • Excellent troubleshooting skills
  • Experience with incident response, post-incident reviews, and proactive problem solving
  • Ability to work autonomously within a team environment
  • Valued qualities include curiosity, transparency, action bias, and kindness

Your day-to-day will include:
  • Conducting regular 1:1 meetings
  • Reviewing and creating SLOs, reducing costs, and improving monitoring and automation
  • Enhancing observability of customer environments
  • Designing solutions for reliability and scalability
  • Developing fault-tolerant patterns
  • Collaborating on product strategy and technical design
  • Participating in code reviews and design discussions
  • Sharing knowledge about SRE best practices
  • Engaging in incident response, investigation, and communication with customers

In Germany, the salary range is €91,464 - €114,330, with benefits including equity and bonuses. Compensation varies by location, experience, and skills.
