15/7/2025 Senior DevOps Engineer Bx-724

WHATJOBS?

Madrid

On-site

EUR 60,000 - 85,000

Full-time

Posted 27 days ago

Vacancy summary

A leading digital consulting company is seeking a Senior DevOps-HPC Engineer. The role involves managing the migration of HPC clusters to Google Cloud, optimizing infrastructure, and collaborating in a dynamic environment. Experience with SLURM, Linux, and automation tools is required, along with scripting skills. The company offers an inclusive work environment and career opportunities.

Qualifications

  • 5 years of experience in HPC environments.
  • Experience managing Linux-based systems.
  • Experience with automation tools such as Ansible and Terraform.

Responsibilities

  • Lead the migration of SLURM-based HPC clusters to GCP.
  • Optimize SLURM configurations and workflows.
  • Collaborate with engineering and support teams to ensure a smooth migration.

Skills

SLURM
Linux
Python
Bash
Ansible
Terraform
GCP
MPI

Job description

About us

For more than 20 years, our global network of passionate technologists and pioneering craftspeople has delivered cutting-edge technology and game-changing consulting to companies on the brink of AI-driven digital transformation. Since 2001, we have grown into a full-service digital consulting company with 5500+ professionals working on a worldwide ambition. Driven by the desire to make a difference, we keep innovating, fueling our growth with a knowledge worker culture. When teaming up with Xebia, expect in-depth expertise based on an authentic, value-led, and high-quality way of working that inspires all we do.

At Xebia, we put ‘People First’—committed to attracting diverse talent and fostering an inclusive, respectful workplace where everyone is valued for their contributions. We welcome all individuals and evaluate solely on the quality of their work and teamwork.

About the Role

As a Senior DevOps-HPC Engineer at Xebia, you will join a dynamic engineering team in a high-energy and collaborative environment. This role is ideal for a seasoned HPC engineer with deep expertise in SLURM, Linux, and cloud migration, who thrives on leading complex projects, designing robust architectures, and implementing high-performance solutions in Google Cloud.

Responsibilities:

  1. Lead the migration of on-premises SLURM-based HPC clusters to Google Cloud Platform.
  2. Design, implement, and manage scalable and secure HPC infrastructure solutions on GCP.
  3. Optimize SLURM configurations and workflows to ensure efficient use of cloud resources.
  4. Manage and optimize HPC environments, focusing on workload scheduling, job efficiency, and scaling SLURM clusters.
  5. Automate cluster deployment, configuration, and maintenance tasks using scripting languages (Python, Bash) and automation tools (Ansible, Terraform).
  6. Integrate the HPC software stack using tools such as Spack for dependency management and streamlined installation of HPC libraries and applications.
  7. Deploy, manage, and troubleshoot applications using MPI, OpenMP, and other parallel computing frameworks on GCP instances.
  8. Collaborate with engineering, support teams, and stakeholders to ensure smooth migration and ongoing operation of HPC workloads.
  9. Provide expert-level support for performance tuning, job scheduling, and cluster resource optimization.
  10. Stay current with emerging HPC technologies and GCP services to continually improve HPC cluster performance and cost efficiency.

Requirements:

  • Minimum 5 years of experience with HPC environments, including SLURM workload manager, MPI, and other HPC-related software.
  • Extensive hands-on experience managing Linux-based systems, including performance tuning and troubleshooting in an HPC context.
  • Proven experience migrating and managing SLURM clusters in cloud environments, preferably GCP.
  • Proficiency with automation tools such as Ansible and Terraform for cluster deployment and management.
  • Experience with Spack for managing and deploying HPC software stacks.
  • Strong scripting skills in Python, Bash, or similar languages for automating cluster operations.
  • In-depth knowledge of GCP services relevant to HPC, such as Compute Engine, Cloud Storage, and VPC networking.
  • Strong problem-solving skills with a focus on optimizing HPC workloads and resource utilization.

Preferred:

  • Google Cloud Professional DevOps Engineer or similar GCP certifications.
  • Familiarity with GCP offerings relevant to HPC, such as Preemptible VMs and HPC VM images, and with cost-optimization strategies.
  • Experience with performance profiling and debugging tools for HPC applications.
  • Advanced knowledge of HPC data management strategies, including parallel file systems and data transfer tools.
  • Understanding of container technologies (e.g., Singularity, Docker) specifically within HPC contexts.
  • Experience with Spark or other big data tools in an HPC environment is a plus.
  • Expertise in migrating modules and libraries, Spack containers, and CI pipelines (GitHub Actions).

The original posting can be found on Kit Empleo: https://www.kitempleo.es/empleo/212349551/%e2%96%b7-15-7-2025-senior-devops-engineer-bx-724-madrid/?utm_source=html
