Customer Reliability Engineer

deepset GmbH

Berlin

Remote

EUR 70.000 - 90.000

Full-time

Posted 7 days ago

Summary

A leading AI solutions company in Berlin is seeking a skilled Technical Specialist to own technical outcomes from Proof of Concept to production. Candidates must have extensive experience in Kubernetes and Python development for customer-facing applications. The ideal candidate will be part of a remote-first team with a strong culture of trust and flexibility. Competitive salary and benefits include 30 days of vacation and a monthly allowance for sports and mental health support.

Benefits

Remote-first setup with flexible hours
30 days vacation, extra days for family/sick leave
Competitive salary & stock options
Monthly sports & mental health support allowance
Annual learning & development budget
Monthly team socials & in-person meetups
Dog-friendly Berlin HQ

Qualifications

  • Hands-on programming experience in Python.
  • 5 years across SRE/Platform/Solutions with evidence of shipping customer-facing builds.
  • Strong with Kubernetes, containers, Linux, IaC, CI/CD, and networking.
  • Enterprise integration experience; confident communicator with execs.

Responsibilities

  • Own technical outcomes from POC to production, including integrations.
  • Define SLOs/Error Budgets with customers; implement observability.
  • Train customer teams on operations and emergency procedures.
  • Document clearly: setup guides, SLOs, and testing procedures.

Skills

Kubernetes
Python
Linux
Terraform
Prometheus
Grafana
OpenTelemetry

Job description
TL;DR

Why this role exists: Land value fast and keep it running. You embed with strategic customers to design, build, and operationalize deployments of our platform, and then leave behind all required resources and capabilities so that our customers feel confident and self-sufficient.

Why deepset

At deepset, we're on a mission to make custom AI solutions accessible to every organization. With Haystack, thousands of developers build advanced LLM applications every day, while our enterprise-ready AI Platform helps companies turn large language models into business value. We're remote-first, flexible, and built on a culture of trust and ownership. You'll collaborate with top-tier tech talent, tackle meaningful challenges, and help transform complex AI into solutions that are simple, powerful, and ready for the real world.

What you will do
Design & Land

Own technical outcomes from POC to production: integrations, data connectors, workflows, and infra-as-code (Kubernetes / Terraform / Helm).

Produce reference architectures and reusable templates; upstream patterns to Product to reduce future custom work.

Unblock enterprise environments: identity (OIDC / SAML), networking, storage, GPU scheduling, and observability hooks.

Run & Harden

Define SLOs / Error Budgets with customers; implement end-to-end observability (logs / metrics / traces) and dashboards.

Create runbooks / playbooks; lead L3 incident response and RCAs; drive reliability roadmaps to closure.

Plan / execute upgrades and security patches in change windows; ensure rollback and post-upgrade verification.

Be an active member of the on-call rotation to make sure we deliver an excellent customer experience.

Partner & Enable

Train customer teams on operations and emergency procedures; hand off cleanly to Support / CSM.

Prioritize reliability and productization backlog with Product / Engineering based on field signal.

Document clearly: setup guides, diagrams, SLOs, testing / DR procedures, and golden path standards.

Requirements

Hands-on programming experience in Python (needed for improvements, bug fixing, and small feature builds).

5 years across SRE / Platform / Solutions / FDE with evidence of shipping customer-facing builds and operating production systems.

Strong with Kubernetes, containers, Linux, IaC (Terraform / Helm), CI/CD, networking (TLS, DNS, ingress / LB), and backup / restore.

Observability stacks (Prometheus / Grafana / OpenTelemetry / ELK); scripting (Python / Bash).

Enterprise integration experience (SSO, secrets, compliance); confident communicator with execs and engineers under time pressure.

Must be a resident of the European Union with an EU passport.

Nice to have
  • German language skills
Benefits
  • Remote-first setup with flexible hours & tech of your choice
  • 30 days vacation, extra days for family/sick leave
  • Competitive salary & stock options for every team member
  • Monthly sports & mental health support allowance with Oliva
  • Annual learning & development budget
  • Monthly team socials & in-person meetups
  • Dog-friendly Berlin HQ
About us

Founded in 2018, deepset builds open and enterprise-grade tools that help teams build AI with purpose. From Haystack, our open-source framework, to the deepset AI Platform, we give developers and organizations the building blocks to solve complex, high-impact challenges with AI, with full control, transparency, and sovereignty. Backed by GV and Balderton, we're growing the world's production AI community and customer base, solving challenges too critical to get wrong.

Visit us to learn more: deepset Website, Haystack Website, GitHub, LinkedIn, X deepset (Twitter), X haystack (Twitter)

Key Skills

Kubernetes, FMEA, Continuous Improvement, Elasticsearch, Go, Root Cause Analysis, Maximo, CMMS, Maintenance, Mechanical Engineering, Manufacturing, Troubleshooting

Employment Type : Full-Time

Experience : years

Vacancy : 1
