Site Reliability Engineer (SRE)

Blackfluo.ai

Remote

EUR 70,000 - 90,000

Full-time

Posted 30+ days ago

Summary

A leading company is seeking a Site Reliability Engineer (SRE) to enhance and secure its AWS infrastructure. This pivotal role involves automating operations, managing CI/CD processes, and ensuring system reliability in a fully remote work environment, allowing for flexible scheduling and a high degree of autonomy.

Benefits

100% remote work

Flexible hours

High-impact role with autonomy

Collaborative international team

Cutting-edge tech stack

Qualifications

5+ years of experience as an SRE or in a similar role.
Deep knowledge of AWS services (EC2, ECS, RDS, etc.).
Proficient in infrastructure-as-code tools (Terraform, CloudFormation).

Responsibilities

Design, implement, and maintain scalable AWS infrastructure.
Develop and manage CI/CD pipelines and infrastructure-as-code.
Set up and optimize monitoring, alerting, and incident response.

Skills

AWS services

infrastructure-as-code

Linux systems administration

networking concepts

CI/CD tools

observability tools

Tools

Terraform

CloudFormation

GitLab CI

Jenkins

Prometheus

Grafana

Datadog

About the job: Site Reliability Engineer (SRE)

Job Description

Location: Full remote, EU timezone (CET +/- 2 hours)
Start Date: As soon as possible
Languages: English required

We are looking for a skilled Site Reliability Engineer (SRE) with deep expertise in AWS to help us scale and secure our infrastructure. As an SRE, you will be instrumental in ensuring the reliability, performance, and scalability of our production systems. You'll work closely with engineering teams to automate operations, improve monitoring, and design resilient systems.

Responsibilities:

Design, implement, and maintain scalable, resilient AWS infrastructure
Develop and manage CI/CD pipelines and infrastructure-as-code (Terraform or similar)
Set up and optimize monitoring, alerting, and incident response processes
Proactively identify and resolve performance, reliability, and security issues
Collaborate with development teams to integrate SRE best practices into their workflows
Conduct post-mortems and root cause analyses on incidents
Participate in on-call rotations to support 24/7 system reliability

Requirements:

5+ years of experience as an SRE or in a similar role
Deep knowledge of AWS services (EC2, ECS, RDS, Lambda, S3, etc.)
Proficient in infrastructure-as-code tools (Terraform, CloudFormation, etc.)
Solid experience with Linux systems administration and networking concepts
Experience with CI/CD tools (GitLab CI, Jenkins, etc.)
Familiarity with observability tools (Prometheus, Grafana, Datadog, etc.)

Nice To Have:

Experience with container orchestration (ECS, EKS, or Kubernetes)
Understanding of security best practices in cloud environments
Exposure to incident management frameworks (SRE handbook, etc.)

Why Join Us:

100% remote work with flexible hours
High-impact role with autonomy and ownership
Collaborative and international engineering team
Cutting-edge tech stack with a strong focus on reliability and automation
