Site Reliability Engineer

Ranger Technical Resources

Dortmund

On-site

EUR 70,000 - 100,000

Full-time

Posted 30 days ago

Summary

A leading PaaS company is seeking a Site Reliability Engineer in Dortmund to ensure the reliability and performance of critical infrastructure. The role involves building and maintaining scalable systems, optimizing CI/CD processes, and implementing advanced AWS features for enhanced security and uptime.

Qualifications

  • 10+ years of experience in Site Reliability Engineering or related roles.
  • Deep understanding of AWS and its modules/services.
  • Strong background in Linux administration and troubleshooting.

Responsibilities

  • Design and support EC2/ECS/EKS/Fargate environments for high availability.
  • Maintain and optimize CI/CD pipelines to streamline software delivery.
  • Implement monitoring tools to proactively detect and resolve system issues.

Skills

AWS
Auto Scaling
Observability tools
Scripting
CI/CD

Education

Bachelor's degree or higher in Computer Science

Tools

New Relic
DataDog
Splunk
Ansible
Bash
Python
Go

Job Description

Our partner, an innovative PaaS company specializing in remote monitoring and network management solutions, is looking for a Site Reliability Engineer to help ensure the reliability, scalability, and performance of critical infrastructure and applications. In this role, you’ll build and maintain highly available systems, support and optimize CI/CD pipelines, and determine optimal solutions for the company’s products. You’ll collaborate closely with development, DevOps, and other teams to maintain high standards of uptime, security, and user experience for millions of endpoints.

Experience and Education:

  • Bachelor's degree or higher in Computer Science, Information Systems, Information Technology, or a related technical field, or equivalent experience.
  • 10+ years of experience in Site Reliability Engineering, DevOps, Infrastructure, or related roles.
  • Deep understanding of AWS and its various modules and services.
  • Strong background in Linux administration and troubleshooting.
  • Proven experience implementing and managing CI/CD pipelines and Infrastructure as Code (IaC) solutions.
  • Proven experience using monitoring and observability tools to proactively manage system health.

Skills and Strengths:

  • AWS (Amazon Web Services)
  • Auto Scaling
  • Fargate
  • Route53
  • Observability tools (New Relic, DataDog, Splunk)
  • Scripting (Ansible, Bash, Python, Go)
  • CI/CD

Primary Job Responsibilities:

  • Design and support EC2/ECS/EKS/Fargate environments for high availability and fault tolerance.
  • Implement advanced AWS features (Route53, ALB/NLB, multi-region setups) to ensure global reliability.
  • Maintain and optimize the existing CI/CD pipelines and deployment processes to streamline software delivery, reduce risks, and ensure seamless integration of new features.
  • Collaborate with Development, QA, and DevOps teams to integrate best practices into build and release processes.
  • Implement, manage, and enhance monitoring tools to proactively detect and resolve system issues.
  • Administer and optimize Linux-based servers and applications, ensuring stability, performance, and security.
  • Implement and manage containerization solutions to improve scalability and efficiency.
  • Implement security best practices across AWS environments, ensuring compliance with industry standards and safeguarding cloud infrastructure.
  • Develop automated incident response mechanisms and self-healing solutions to minimize downtime and enhance fault tolerance.
  • Diagnose and resolve infrastructure, networking, and application-related performance issues to ensure operational efficiency.
  • Ensure business continuity by designing and maintaining robust backups, failover strategies, and disaster recovery solutions.
  • Identify, diagnose, and resolve infrastructure or application performance bottlenecks.
  • Create real-time monitoring dashboards and alerting systems to track system health, capacity, and performance trends.
  • Work closely with development teams to fine-tune infrastructure for cost efficiency while maintaining high performance.
