(OP-362) Data Scientist - Gen AI Specialist

Indegene

Toledo

On-site

EUR 40,000 - 65,000

Full-time

Posted 4 days ago

Job summary

A global healthcare consultancy is seeking an experienced AI Specialist in Toledo to develop and train Generative AI models and perform data analysis. Candidates should have strong knowledge of machine learning, particularly with AWS technologies and NLP tools, and experience in document parsing. The position offers the opportunity to work with innovative technologies in a collaborative environment.

Qualifications

  • Strong experience in building scalable (Gen) AI applications on AWS.
  • Experience with document parsing tools.
  • Good knowledge of Python for data science.

Responsibilities

  • Develop and train Generative AI models.
  • Perform data analysis and prepare data for training.
  • Integrate AI models with systems like Snowflake and AWS.

Skills

Machine learning
Generative AI
Document Parsing
Optical Character Recognition (OCR)
Natural Language Processing (NLP)
Vector Databases
Python
Git

Tools

AWS
Snowflake
PyMuPDF
Apache Tika
Tesseract OCR
spaCy
Hugging Face Transformers

Job description

Who are we?

Indegene is a global consultancy at the forefront of driving innovation in the Pharmaceutical and Life Sciences industry by combining medical and commercial expertise with innovative digital and AI technologies.

We enable global healthcare organizations to address complex challenges and drive better health and business outcomes by seamlessly integrating analytics, technology, operations, and medical expertise. Find out more at indegene.com

Who are you?

We are looking for experienced AI Specialists to:

  • Develop and train Generative AI models.
  • Perform data analysis and prepare data for AI model training.
  • Integrate AI models with Snowflake, AWS and other systems.

Required knowledge:

  • Good knowledge of machine learning and Generative AI, especially content generation using AWS Bedrock and OpenAI-based models.
  • Strong experience in building scalable (Gen) AI applications on AWS.
  • Unstructured Data Processing & Extraction
      • Document Parsing: Experience with PDF, Word, and HTML parsing using tools like PyMuPDF, Apache Tika, or Textract.
      • Optical Character Recognition (OCR): Familiarity with Tesseract OCR, AWS Textract, or Azure Form Recognizer for extracting text from scanned documents.
      • Natural Language Processing (NLP): Ability to clean, preprocess, and structure text using spaCy, NLTK, or Hugging Face Transformers; familiarity with Named Entity Recognition (NER).
  • Vector Database & Embeddings
      • Vector Databases: Expertise in FAISS, Qdrant, Pinecone, Weaviate, or ChromaDB for semantic search and retrieval.
      • Embedding Models: Understanding of OpenAI, Cohere, or Sentence-BERT embeddings for document similarity and retrieval.
      • Chunking & Indexing: Experience in splitting large documents into meaningful chunks for efficient retrieval, taking document structure and form into account and maintaining metadata.
  • Strong background in and understanding of vector databases.
  • Experience in building (Gen) AI solutions on Snowflake is a plus.
  • Experience with graph databases is a plus.
  • Experience with Agentic AI is a plus.
  • Good knowledge of Python for data science, as well as Streamlit for rapid deployment of prototypes.
  • Good knowledge of Git, ideally Azure DevOps.
  • Experience working in an agile and international environment.
  • Experience in the setup and usage of CI/CD pipelines, as well as writing software in a test-driven fashion.
  • Good documentation and coaching practice.

EQUAL OPPORTUNITY
Indegene is proud to be an Equal Employment Employer and is committed to the culture of Inclusion and Diversity. We do not discriminate on the basis of race, religion, sex, colour, age, national origin, pregnancy, sexual orientation, physical ability, or any other characteristics. All employment decisions, from hiring to separation, will be based on business requirements, the candidate’s merit and qualification.
We are an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, colour, religion, sex, national origin, gender identity, sexual orientation, disability status, protected veteran status, or any other characteristics.

The original posting can be found on Kit Empleo:
https://www.kitempleo.es/empleo/219665837/op-362-data-scientist-gen-specialist-toledo/?utm_source=html