Senior Specialist - Site Reliability Engineer III

On

Germany

On-site

EUR 60,000 - 100,000

Full-time

Posted 3 days ago

Summary

Join a leading company as a Site Reliability Engineer in Berlin, where you'll enhance cloud infrastructure to support e-commerce platforms and customer applications. Responsibilities include optimizing performance, developing automation solutions, and ensuring reliability. You'll collaborate with teams to drive significant improvements, contributing to On's innovative digital future.

Benefits

Supportive team-oriented atmosphere
Access to personal self-care tools
Commitment to fair and inclusive work environment

Qualifications

  • Experience in site reliability engineering managing high-traffic systems.
  • Expertise in cloud platforms (GCP, AWS).
  • Proficiency in scripting (e.g., Python, Go) for automation.

Responsibilities

  • Ensure high availability and performance of digital platforms.
  • Build and maintain cloud-based infrastructure.
  • Lead incident resolution and troubleshoot complex issues.

Skills

Automation
Cloud platforms
Networking
Scripting (Python, Go)
Monitoring

Education

Bachelor’s degree in Computer Science or related field

Tools

Terraform
Kubernetes
ArgoCD
GitHub Actions

Job Description

In the dynamic landscape of On, the tech thrives much like a spirited runner: always moving, always improving. We are building technology that continues to supercharge the growth of On, helping to ignite the human spirit through movement. We’re seeking a Site Reliability Engineer to ensure our digital platforms deliver exceptional performance, reliability, and scalability to support our global customer base.

As a Site Reliability Engineer (SRE) at On, you will play an important role in building and maintaining our cloud infrastructure to support our e-commerce platforms, customer-facing applications, and internal systems. You will work closely with engineering teams to improve reliability, optimize performance, and implement automation solutions.

Your Mission
  1. System Reliability & Performance: Contribute to high availability (99.99%+ uptime), scalability, and performance of On’s digital platforms through proactive optimization and robust infrastructure design.
  2. Infrastructure Development: Build and maintain cloud-based infrastructure using Infrastructure-as-Code (IaC) tools.
  3. Automation: Develop and implement automation solutions to streamline deployments, reduce toil, and enhance monitoring.
  4. Incident Response: Lead incident resolution, troubleshoot issues, and perform root cause analyses to minimize downtime and improve system resilience.
  5. Monitoring & Observability: Improve and maintain monitoring, logging, and alerting solutions to ensure proactive issue detection and resolution.
  6. Collaboration: Partner with the SRE team and software engineers to identify opportunities and to develop and roll out major features.
  7. Compliance & Security: Integrate security best practices into our systems and solutions.
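
As a rough illustration of what the 99.99% availability target in item 1 implies, the sketch below converts an availability percentage into a downtime budget. The function name and the 30-day and 365-day measurement windows are assumptions chosen for illustration; the posting does not say how On defines or measures uptime.

```python
def downtime_budget_minutes(availability_percent: float, period_days: float) -> float:
    """Return the downtime (in minutes) allowed by an availability target.

    Hypothetical helper for illustration only: it assumes availability is
    measured as a simple fraction of wall-clock time over the period.
    """
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    if period_days < 0:
        raise ValueError("period_days must be non-negative")
    total_minutes = period_days * 24 * 60
    # The allowed downtime is the share of the period NOT covered by the target.
    return total_minutes * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    # Assumed windows: a 30-day month and a 365-day year.
    for label, days in (("30-day month", 30), ("365-day year", 365)):
        budget = downtime_budget_minutes(99.99, days)
        print(f"{label}: about {budget:.2f} minutes of downtime")
```

Under these assumptions, a 99.99% target leaves roughly 4.3 minutes of downtime per 30-day month, or about 52.6 minutes per year.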

Your Story
  1. Experience in site reliability engineering with a track record of managing complex, high-traffic systems.
  2. Expertise in cloud platforms (GCP, AWS) and container orchestration (Kubernetes, GKE).
  3. Proficiency in scripting and programming (e.g., Python, Go) for automation and tooling.
  4. Experience with CI/CD pipelines (ArgoCD, GitHub Actions) and IaC (Terraform).
  5. Solid understanding of networking, load balancing, and DNS management.
  6. Experience with observability and monitoring for cloud native environments.
  7. Strong analytical skills with a proactive approach to resolving complex technical challenges.
  8. Excellent communication skills, with the ability to explain technical concepts to diverse stakeholders.

Nice to Have:

  1. Background with e-commerce platforms or high-traffic consumer applications.
  2. Experience in platform engineering, dedicated to building solutions that enhance developer experience (DevEx) and boost software development efficiency.
  3. Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).

Meet The Team

You will join a skilled and dynamic team of cloud and site reliability engineers dedicated to transforming On’s technological foundation. We are crafting scalable, resilient cloud solutions to power internal operations, enhance product performance, and support On’s growth. As a key member of our team, you will shape our cloud infrastructure strategy, ensuring robust, efficient, and sustainable systems that drive innovation. Join us in Berlin to make a lasting impact on On’s digital future!

What We Offer

On is a place that is centered around growth and progress. We offer an environment designed to give people the tools to develop holistically: to stay active, to learn, explore, and innovate. Our distinctive approach combines a supportive, team-oriented atmosphere with access to personal self-care for both physical and mental well-being, so each person is led by purpose. On is an Equal Opportunity Employer. We are committed to creating a work environment that is fair and inclusive, where all decisions related to recruitment, advancement, and retention are free of discrimination.
