SRE / Platform Engineer

Antler

Berlin

On-site

EUR 70,000 - 90,000

Full-time

Posted 30+ days ago

Summary

A fast-growing tech startup in Berlin is seeking a Site Reliability Engineer to enhance the reliability and scalability of their systems. The ideal candidate has over 5 years of experience in similar roles, deep expertise in Infrastructure as Code, and strong programming skills. This role offers significant ownership within a dynamic environment, competitive equity compensation, and a vibrant office space.

Benefits

Equity compensation package

Paid Uber Eats when working late

Team events and off-sites

Beautiful office environment in Berlin

Qualifications

5+ years of experience in Site Reliability Engineering or similar.
Proficient with major cloud platforms (GCP, AWS, Azure).
Strong programming skills for automation and tooling.

Responsibilities

Own the reliability, scalability, and performance of systems.
Design and maintain tooling for service performance.
Develop incident response practices.

Skills

Site Reliability Engineering

Infrastructure Engineering

Terraform

Datadog

AWS

Python

Tools

Kubernetes

CI/CD

CloudFormation

What you’ll do

Own the reliability, scalability, and performance of Peec AI’s core systems and infrastructure
Design, build, and maintain the tooling, automation, and monitoring that keep our services fast, secure, and highly available
Partner closely with product and engineering teams to ensure new features are reliable, observable, and easy to operate from day one
Develop and refine incident response practices, ensuring issues are triaged quickly and resolved with minimal user impact
Proactively identify and address bottlenecks, single points of failure, and operational inefficiencies across the stack
Champion operational excellence and a culture of reliability, driving best practices across the engineering organization

What we’re looking for

5+ years of experience in Site Reliability Engineering, Infrastructure Engineering, or similar roles supporting production systems at scale
Deep expertise with Infrastructure as Code tools (Terraform, Pulumi, CloudFormation, etc.)
Strong experience with observability platforms (e.g., Datadog, Sentry, Prometheus, Grafana) and incident response tooling (PagerDuty, Incident.io, or similar)
Proven proficiency with major cloud platforms (GCP, AWS, or Azure) and modern distributed systems
Strong programming and scripting skills (e.g., TypeScript and Python) for automation and tooling
A track record of diagnosing complex system problems and implementing robust, long‑term solutions
Solid understanding of CI/CD, Kubernetes, containerization, networking, databases, and cloud security principles
Excellent problem‑solving skills, attention to detail, and a strong commitment to operational excellence

Bonus Points

Experience supporting AI/ML workloads or data‑intensive systems
Prior SRE experience in a high‑growth startup or globally distributed infrastructure environment
Familiarity with zero‑downtime migrations, multi‑region architectures, or compliance frameworks

What we offer

Exciting and challenging work with real impact and ownership at one of Europe’s fastest‑growing Series A startups
Regular team events and off‑sites
Aggressive equity compensation package
Paid Uber Eats & Uber home when working late
The most beautiful office space and work environment in Berlin
