Site Reliability Engineer (all genders)

TN Germany

Germany

On-site

EUR 70,000 - 90,000

Full-time

Posted 3 days ago

Summary

A leading company in children's interactive audio platforms seeks a Site Reliability Engineer to ensure the reliability and performance of their on-premise and cloud systems. Responsibilities include managing infrastructures, optimizing services, and collaborating with teams to enhance system reliability. The ideal candidate will have extensive experience in Reliability Engineering and proficiency in programming and cloud services.

Benefits

Deutschlandticket
Monthly parking contribution
Leasing bicycle
Remote work subsidy
30 days paid annual leave
10 unpaid leave days
Flexible working options
Training opportunities
Language learning app
Discounts on products

Qualifications

  • 5+ years of experience in Reliability Engineering.
  • Experience with AWS cloud services.
  • Hands-on experience with CI/CD pipelines.

Responsibilities

  • Design, deploy and manage on-premise and cloud infrastructures.
  • Streamline and automate deployment processes.
  • Lead incident response efforts and conduct post-incident reviews.

Skills

Python
Go
Rust
Linux
Problem Solving
Communication

Education

Bachelor's degree

Tools

Terraform
Ansible
Docker
Kubernetes
GitLab CI/CD

Job description

Location:

Germany

EU work permit required:

Yes

Job Reference:

c1fa5f73d1c5

Posted:

16.05.2025

Expiry Date:

30.06.2025

You as part of tonies:

As a Site Reliability Engineer (all genders) within the Production Systems team at tonies, you will be responsible for ensuring the reliability, availability, and performance of our on-premise bare-metal and cloud systems.

Your tasks and responsibilities will include:

  • Infrastructure Management: Design, deploy and manage our on-premise bare-metal and cloud infrastructures, ensuring the reliability, scalability and performance of our systems.
  • Deployment and Automation: Streamline and automate deployment processes, leveraging CI/CD pipelines and automation tools. Ensure reliable and consistent deployment of system services and custom applications across cloud and on-premise environments.
  • Cloud and On-Premise Services Optimization: Manage and optimise our hybrid infrastructure, including AWS cloud services and on-premise bare-metal systems. Ensure cost-effective and efficient operation of all infrastructure components.
  • Reliability Engineering: Design and track Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for critical services, ensuring they are met or exceeded. Implement error budgets and policies to balance reliability with feature development.
  • Incident Management: Lead incident response efforts, conduct post-incident reviews (PIRs), and identify opportunities for proactive improvements, covering incidents in both cloud and on-premise environments.
  • Monitoring and Alerting: Develop and maintain robust monitoring and alerting systems to detect and respond to issues in real time, covering all components of the hybrid infrastructure. Ensure early problem detection and resolution.
  • Collaboration: Collaborate closely with software engineers and cross-functional teams to continuously improve the reliability and performance of our hybrid infrastructure through automated testing, monitoring, and proactive maintenance.
  • Documentation: Create and maintain documentation for system configurations, processes, and best practices, encompassing all infrastructure components. Facilitate knowledge sharing within the team.

What we are looking for:

  • 5+ years of progressive experience in Reliability Engineering with proven experience in implementing SLOs, SLIs and other best practices from the SRE methodology
  • Proficiency in programming languages like Python, Go, or Rust
  • Proficiency in Linux-based systems administration and scripting
  • Experience with AWS cloud services and infrastructure
  • Experience with on-premise bare-metal infrastructure
  • Hands-on experience with CI/CD pipelines (GitLab CI/CD is a plus)
  • Knowledge of infrastructure-as-code (IaC) tools like Terraform or Ansible
  • Experience with containerization and orchestration technologies such as Docker and Kubernetes
  • Demonstrated ability to manage and optimise critical systems in challenging environments
  • Expertise in monitoring, alerting, and logging tools
  • Strong problem-solving skills and a proactive approach to system reliability
  • Excellent communication and collaboration skills, very good knowledge of English
  • Leadership experience or a proven track record of guiding reliability improvements is a plus
  • Bachelor's degree in a relevant field or equivalent practical experience is a plus

Why tonies?

  • Global Teamwork: We collaborate across departmental and country borders on our vision to bring the Toniebox into every child's room in the world.
  • Come as you are: This applies not only to the dress code but also to everything else. Because only where you truly feel comfortable can you give your best.
  • Mobility: Choose the option that suits you best - a Deutschlandticket (public transport ticket) for unlimited mobility, a monthly contribution of fifty euros for an office parking space, a leasing bicycle, or a remote work subsidy.
  • Enhanced Security: Benefit from subsidies for company pension plans, occupational pension schemes, and occupational disability insurance.
  • Rest & Time Off: Enjoy 30 days of paid annual leave as well as three additional days off such as Rosenmontag, Christmas Eve, and New Year's Eve. After one year of employment, you can also use up to 10 "toniecation days" (unpaid leave days).
  • Flexible Working: With your own individual equipment, you can work remotely for up to 5 days in consultation with your team - depending on your area of responsibility. And if you're up for a workation, you can work from abroad for up to 4 weeks per year with us.
  • Continuous Learning: Benefit from our internal and external training opportunities as well as an individual learning budget to continuously expand your knowledge.
  • Language Learning & Relaxation: Improve your communication skills with the language learning app Babbel and find relaxation through our access to the meditation app Calm.
  • Discounts: Benefit from attractive discounts on our entire range of tonies products.

Good to know:

As part of our principles, we are committed to supporting inclusion and diversity at tonies. We actively celebrate our colleagues’ different abilities, ethnicities, faith and gender. Everyone is welcome and supported in their development at all stages in their journey with us.

We look forward to hearing from you!

Esther Sommerfeld
Talent Acquisition Lead

About us

tonies is the world’s largest interactive audio platform for children with more than 7 million Tonieboxes and 88 million Tonies sold. The intuitive and award-winning audio system has changed the way young children play and learn independently with its child-safe, wireless, and screen-free approach. Tonieboxes have been activated in over 100 countries, and the content portfolio includes more than 1,100 Tonies figurines in several languages.