Site Reliability Engineer

ZipRecruiter

Braunschweig

Hybrid

EUR 60,000 - 90,000

Full-time

Posted 5 days ago

Summary

A leading tech company focusing on semiconductor equipment seeks a professional to enhance their distributed computing platforms. This role involves troubleshooting, automating testing, and improving system stability to ensure high performance for global clients. The ideal candidate should have strong scripting skills and practical experience in distributed systems, working collaboratively with cross-functional teams.

Qualifications

  • Experience with DC/OS and zero-downtime tech introduction.
  • Proficient in scripting (Python) and expertise in Linux.
  • Willing to work remotely outside regular hours occasionally.

Responsibilities

  • Help application developers understand the infrastructure.
  • Improve VCP stability and reliability through automation.
  • Ensure customer satisfaction and resolve system-level issues.

Skills

Networking issues
Automated testing
CI/CD pipelines

Education

Practical experience with distributed computing systems

Tools

Ansible
Maven
Nexus
Bamboo
GitHub

Job Description

Our client is one of the world’s leading manufacturers of semiconductor chip-making equipment. A majority of the world’s microchips receive their critical lithographic patterning in machines made by this organisation. In addition, they produce metrology tools and advanced applications to analyze and optimize the performance of the customer production process.

Job Mission

Troubleshoot short-term problems and translate them into structural improvements to our distributed data and compute platform infrastructure. Be accurate and precise, and help drive up the aggregate availability of the installations of these distributed computing systems in Korea, Taiwan, Israel, China and the US. Be part of the computing platform that is one of the main pillars of the production of next-generation microchips for companies like Apple, Samsung, and others.

Responsibilities:

  • Create awareness in other teams about methods and procedures we use to help them prevent repetitive help requests.
  • Help application developers understand the infrastructure, cluster, and system.
  • Understand and explain how the system fits into the customer’s ecosystem.
  • Share knowledge and mindset with other teams (development and infrastructure engineers).
  • Contribute towards building VCP as a product that meets our quality standards.
  • Increase stability and reliability of VCP through automated testing and automation.
  • Ensure customer satisfaction and product reliability.
  • Improve the functionality and reliability of VCP.
  • Translate customer ecosystem needs into engineering deliverables.
  • Identify and resolve system/cluster-level issues.
  • Combine individual stories into a cohesive system.
  • Enhance system resilience to make VCP reliable, including bug fixing and structural improvements.
  • Implement regression tests and structural fixes to resolve bugs.
  • Manage predictable component lifecycle as an ambassador.
  • Maintain the technical roadmap (application lifecycle management).
  • Support feature and service requests from the field.
  • Suggest and implement improvements to our technical solutions and workflows in collaboration with your team and stakeholders.

Highly Valued Qualifications & Experiences:

  • Experience with DC/OS.
  • Experience with zero-downtime technology introduction, including data migration.
  • Passion for automated testing and qualification, ideally as part of CI/CD pipelines.
  • Deep understanding of networking issues.
  • Willingness to work remotely outside regular hours to build fail-safe systems, as an exception rather than the rule.

Required Qualifications & Experiences:

  • Practical experience with distributed computing systems.
  • Experience with build and release infrastructure such as Maven, Nexus, Bamboo, and GitHub.
  • Proficiency in at least one scripting language (Python).
  • Experience with Ansible.
  • Expertise in Linux.