Senior Machine Learning Engineer

Intellias

Madrid

On-site

EUR 50,000 - 70,000

Full-time

Posted today

Job description

A tech solutions company in Madrid is looking for an experienced engineer to work on OCR and document intelligence projects. You will develop high-precision solutions for converting PDFs to Markdown and HTML, focusing on complex layouts. Ideal candidates have over 5 years of experience in Machine Learning, particularly in OCR. A strong background in Python and PyTorch is essential, along with excellent communication skills.

Benefits

Customized benefits for well-being

Professional growth opportunities

Qualifications

5+ years of experience in Machine Learning, with at least 2 years in OCR or Document AI.
Strong expertise in Python, PyTorch, and Hugging Face Transformers.
Hands-on experience deploying models on high-performance inference frameworks.

Responsibilities

Research and fine-tune open-source OCR and document intelligence models.
Develop end-to-end solutions for PDF-to-Markdown/PDF-to-HTML conversion.
Collaborate with teams to integrate OCR pipelines into production systems.

Skills

Machine Learning

OCR

Document AI

Python

PyTorch

Hugging Face Transformers

Computer Vision

Cloud Infrastructure

GPU-based Inference

Research-oriented Mindset

Let’s breathe life into great tech ideas! With 3,000 people globally, Intellias is a company where benchmark technological solutions are born. Join in and take your part in digitalizing the world.

We are exploring cutting-edge OCR and metadata extraction from PDF documents. OCR and document intelligence are rapidly evolving fields, with open-source models like DeepSeek OCR and LightOn OCR pushing the boundaries.

We are seeking an experienced engineer to help us build high-precision solutions for PDF-to-Markdown and PDF-to-HTML conversion, particularly for complex documents with diverse layouts.

Key Responsibilities

Research, evaluate, and fine-tune open-source OCR and document intelligence models for text and layout extraction from complex PDFs.
Develop end-to-end solutions for PDF-to-Markdown / PDF-to-HTML conversion, preserving text structure, formatting, and layout accuracy.
Build tools for data preprocessing, annotation, and quality evaluation of OCR outputs.
Implement post-processing techniques, text alignment, and metadata extraction to improve model precision.
Collaborate closely with research and engineering teams to integrate OCR pipelines into production-ready systems.
Stay current with advancements in document AI, multimodal learning, and OCR research.

Required Skills & Experience

5+ years of experience in Machine Learning, with at least 2 years focused on OCR, Document AI, or vision‑language models.
Strong expertise in Python, PyTorch, and Hugging Face Transformers (training, fine‑tuning, inference).
Solid understanding of computer vision and its implementation.
Hands‑on experience deploying LLMs / VLMs on vLLM or similar high‑performance inference frameworks.
Deep understanding of OCR pipelines, layout parsing, and document structure recognition (PDFs, scanned docs, tables, mixed content).
Familiarity with cloud infrastructure and GPU‑based inference pipelines.
Research‑oriented mindset with the ability to experiment, analyze results, and iterate quickly.
Excellent communication and documentation skills.

At Intellias, where technology takes center stage, people always come before processes. We're dedicated to cultivating a tech‑savvy environment that empowers individuals to unlock their true potential and achieve extraordinary results. Our customized benefits not only prioritize your well‑being but also support your professional growth, making this opportunity an ideal match for tech enthusiasts like you.
