Site Reliability Engineer

InterEx Group

Essen

Remote

EUR 50,000 - 90,000

Full-time

30+ days ago

Summary

Join a forward-thinking company that is a leader in semiconductor manufacturing. This role focuses on enhancing distributed computing systems critical for the production of next-generation microchips. You will troubleshoot issues, improve system reliability, and work collaboratively with teams to ensure high-quality standards. If you are passionate about automated testing and eager to contribute to innovative solutions, this opportunity is perfect for you. Embrace the challenge of working in a dynamic environment where your expertise will directly impact the future of technology.

Qualifications

  • Experience with data centers and operating systems.
  • Familiarity with CI/CD pipelines and automated testing.
  • Knowledge of build and release infrastructure.

Tasks

  • Troubleshoot and improve distributed data and compute platform infrastructure.
  • Enhance stability and reliability through automated testing.
  • Support feature requests and propose technical improvements.

Skills

Distributed Computing Systems
Automated Testing
Networking Issues Diagnosis
Scripting (Python)

Education

Bachelor's Degree in Computer Science or related field

Tools

Maven
Nexus
Bamboo
GitHub

Job Description

Our client is one of the world’s leading manufacturers of semiconductor chip-making equipment. A majority of the world’s microchips receive their critical lithographic patterning in machines made by this organisation. In addition, they produce metrology tools and advanced applications to analyze and optimize the performance of the customer production process.

Job Mission

Troubleshoot short-term problems and translate them into structural improvements to our distributed data and compute platform infrastructure. Be accurate and precise, and help drive up the aggregate availability of the installations of these distributed computing systems in Korea, Taiwan, Israel, China and the US. Be part of the computing platform that is a main pillar in the production of next-generation microchips for companies like Apple, Samsung, and others.

Responsibilities

  1. Make other teams aware of the methods and procedures we use, so that they can avoid repetitive help requests.
  2. Help application developers understand the infrastructure, cluster, and systems.
  3. Understand and explain how the system fits into the customer’s ecosystem.
  4. Share knowledge and mindset with other teams (development and infrastructure engineers).
  5. Contribute towards building VCP as a product that meets our quality standards.
  6. Increase stability and reliability of VCP through automated testing and automation.
  7. Enhance customer satisfaction and product reliability.
  8. Improve the functionality and reliability of VCP.
  9. Translate customer ecosystem needs into engineering deliverables.
  10. Identify and resolve system and cluster issues.
  11. Integrate individual stories into comprehensive solutions.
  12. Make VCP reliable by improving system resilience, including bug fixes and structural improvements.
  13. Resolve bugs sustainably by implementing regression tests and structural fixes.
  14. Manage predictable component lifecycle as an ambassador.
  15. Maintain the technical roadmap (application lifecycle management).
  16. Support feature and service requests from the field.
  17. Propose and implement improvements to technical solutions and workflows in collaboration with stakeholders.

Highly Valued Qualifications & Experiences

  1. Experience with data centers and operating systems.
  2. Experience with zero-downtime technology introduction, including data migration.
  3. Passion for automated testing and qualification, ideally within CI/CD pipelines.
  4. Strong interest in diagnosing and resolving networking issues.
  5. Willingness to work remotely outside regular hours when necessary to build fail-safe systems (the exception rather than the rule).

Required Qualifications & Experiences

  1. Practical knowledge of distributed computing systems.
  2. Experience with build and release infrastructure such as Maven, Nexus, Bamboo, and GitHub.
  3. Familiarity with at least one scripting language, preferably Python.