Senior ML Engineer / Researcher

Intellias

Vitoria

Remote

EUR 50,000 - 80,000

Full-time

Posted today

Job description

A leading tech company is seeking a Senior ML Engineer/Researcher for a fully remote position based in Spain. The ideal candidate will have strong experience with Python and OCR technologies, focusing on developing high-accuracy document intelligence solutions. This role requires a strong research mindset and the ability to collaborate effectively with engineering teams. Join us to innovate in the field of document AI and contribute to cutting-edge projects.

Qualifications

Strong hands-on expertise with Python, PyTorch, and Hugging Face Transformers.
Practical experience deploying LLM/VLM models on vLLM or equivalent frameworks.
Solid understanding of OCR pipelines and document structure recognition.

Responsibilities

Research and fine-tune open-source OCR and document intelligence models.
Develop end-to-end solutions for PDF conversion with high accuracy.
Collaborate with teams to integrate OCR pipelines into production systems.

Skills

Python

vLLM

Hugging Face

Computer Vision

PyTorch

Education

B2 level of English

Senior ML Engineer / Researcher

Location: Remote from Spain (Spanish employment contract)

We are actively experimenting with OCR and metadata extraction from PDF documents. OCR is a very active area right now, with open models such as DeepSeek OCR and LightOn OCR competing for the top spots.

We are looking for someone with experience running open-source models on vLLM, with a focus on document intelligence: computer vision that converts PDF → Markdown or PDF → HTML with high precision on complex documents.

Requirements – Tech Stack

Python
vLLM
Hugging Face (inference)
Computer Vision
PyTorch

Requirements

Strong hands‑on expertise with Python, PyTorch, and Hugging Face Transformers (training, fine‑tuning, inference).
Practical experience deploying LLM / VLM models on vLLM or equivalent high‑performance inference frameworks.
Solid understanding of OCR pipelines, layout parsing, and document structure recognition (PDFs, scanned docs, tables, mixed content).
Understanding of cloud infrastructure and GPU‑based inference pipelines.
Research mindset with the ability to experiment, analyze, and iterate quickly.
Strong communication and documentation skills; ability to clearly present findings and proposed improvements.
At least B2 level of English.

Responsibilities

Research, evaluate, and fine‑tune open‑source OCR and document intelligence models for text and layout extraction from complex PDFs.
Develop end‑to‑end solutions for PDF‑to‑Markdown / PDF‑to‑HTML conversion with high accuracy in text structure, formatting, and layout retention.
Build tools for data preprocessing, annotation, and quality evaluation of OCR outputs.
Implement techniques for post‑processing, text alignment, and metadata extraction to enhance model precision.
Collaborate with research and engineering teams to integrate OCR pipelines into production‑grade systems.
Stay up to date with the latest developments in document AI, multimodal learning, and OCR research.
