Site Reliability Engineer

Seedify

Germany

Remote

EUR 60,000 - 80,000

Full-time

Posted 3 days ago

Summary

A cryptocurrency launchpad platform is seeking a highly skilled Site Reliability Engineer to optimize platform performance. This role requires extensive experience in AWS, Kubernetes, and DevOps practices. Responsibilities include managing infrastructure, deploying clusters, and implementing observability systems. Ideal candidates have at least 3 years of SRE experience and relevant cloud certifications. Join a fast-paced and innovative team in Germany.

Qualifications

  • At least 3 years in SRE or related roles with hands-on infrastructure ownership.
  • Experience designing reliability-focused SDLC integrations.
  • Familiarity with incident response and system optimization.

Responsibilities

  • Manage AWS infrastructure using Terraform.
  • Deploy and maintain Kubernetes clusters.
  • Own CI/CD pipelines to improve release velocity.
  • Implement monitoring and alerting systems.
  • Define SLAs and lead incident response efforts.
  • Collaborate with engineers on reliability.

Skills

Kubernetes
AWS
Terraform
Docker
Observability
Ansible
GitHub Actions

Education

Cloud/DevOps certifications

Tools

Prometheus
Grafana
New Relic

Full Job Description

Seedify is a leading cryptocurrency launchpad platform dedicated to fostering innovation and success in the Web3 space. Our mission is to identify and assist promising teams and projects and offer outstanding returns to our investor base.

Job Description

We are seeking a highly skilled Site Reliability Engineer with extensive experience in DevOps, infrastructure optimization, and incident reporting & monitoring. In this role, you will work alongside other DevOps Engineers, a Technical Architect, and Developers to optimize the performance of the Seedify platform.

Responsibilities:
  1. Infrastructure & IaC: Manage AWS infrastructure using Terraform/Terragrunt; optionally Pulumi or AWS CDK. Optimize cost, reliability, and scalability.
  2. Kubernetes Operations: Deploy and maintain Kubernetes clusters with Helm and Kustomize. Architect for high availability and zero downtime.
  3. CI/CD Automation: Own pipelines in GitHub Actions to improve release velocity.
  4. Observability: Implement monitoring and alerting using New Relic, Prometheus, Grafana, and OpenTelemetry. Create health dashboards and custom metrics.
  5. Incident & SLA Management: Define SLAs, lead incident response, and conduct postmortems to improve reliability.
  6. Development Collaboration: Partner with engineers to embed reliability, monitoring, and alerting into the SDLC.

Skills & Qualifications:
  1. Core Tools: Kubernetes, Helm, Kustomize, Docker, Bash, Ansible.
  2. Cloud: Strong AWS experience (EC2, S3, EKS, RDS, Lambda, etc.).
  3. Observability Stack: Prometheus, Grafana, New Relic, OpenTelemetry.
  4. CI/CD: Argo CD, GitHub Actions.
  5. IaC: Terraform, Terragrunt; optional Pulumi/AWS CDK.
  6. Languages (optional): Node.js or similar for automation.
  7. Certifications (optional but preferred): AWS Solutions Architect, Kubernetes Admin/Developer, or other cloud/DevOps certifications.

Experience:

At least 3 years in SRE or related roles, with hands-on infrastructure ownership, incident response, and system optimization. Experience designing reliability-focused SDLC integrations and dashboards.

Soft Skills:
  1. Collaboration: Works well across engineering, product, and operations teams.
  2. Ownership: Drives initiatives end-to-end, especially under pressure.
  3. Adaptability: Thrives in fast-paced, shifting environments.