Data Scientist - Gen AI Specialist

Indegene

Tarragona

On-site

EUR 50,000 - 75,000

Full-time

Posted yesterday

Vacancy description

A leading global consultancy is seeking an experienced AI Specialist in Tarragona, Spain, to develop and train Generative AI models, perform data analysis, and integrate AI solutions with AWS and Snowflake. Ideal candidates have a strong background in machine learning and NLP, solid Python programming skills, and experience working in an agile environment.

Qualifications

  • Good knowledge of machine learning and Generative AI.
  • Experience with unstructured data processing and extraction.
  • Proficiency in Python for data science.

Responsibilities

  • Develop and train Generative AI models.
  • Perform data analysis and prepare data for AI model training.
  • Integrate AI models with Snowflake, AWS, and other systems.

Skills

Machine learning and Generative AI
AWS Bedrock and OpenAI-based models
FAISS
Natural Language Processing (NLP)
Python
Agile work environment

Tools

PyMuPDF
Apache Tika
spaCy
Hugging Face Transformers
Git

Job description

Who are we?

Indegene is a global consultancy at the forefront of driving innovation in the Pharmaceutical and Life Sciences industry, combining medical and commercial expertise with innovative digital and AI technologies.

We enable global healthcare organizations to address complex challenges and drive better health and business outcomes by seamlessly integrating analytics, technology, operations, and medical expertise. Find out more at indegene.com.

Who are you?

We are looking for experienced AI Specialists to:

  • Develop and train Generative AI models.
  • Perform data analysis and prepare data for AI model training.
  • Integrate AI models with Snowflake, AWS, and other systems.

Required knowledge:

  • Good knowledge of machine learning and Generative AI, especially content generation using AWS Bedrock and OpenAI-based models.
  • Strong experience in building scalable (Gen) AI applications on AWS.
  • Unstructured Data Processing & Extraction:
    • Experience with PDF, Word, and HTML parsing using tools like PyMuPDF, Apache Tika, or Textract.
    • Familiarity with Tesseract OCR, AWS Textract, or Azure Form Recognizer for extracting text from scanned documents.
  • Natural Language Processing (NLP):
    • Ability to clean, preprocess, and structure text using spaCy, NLTK, or Hugging Face Transformers, and familiarity with NER (Named Entity Recognition).
  • Vector Database & Embeddings:
    • Expertise in FAISS, Qdrant, Pinecone, Weaviate, or ChromaDB for semantic search and retrieval.
    • Understanding of OpenAI, Cohere, or Sentence-BERT embeddings for document similarity and retrieval.
    • Experience splitting large documents into meaningful chunks and indexing them while preserving document structure and metadata.
  • Strong background and understanding of vector databases.
  • Experience in building (Gen) AI solutions on Snowflake is a plus.
  • Experience with Graph databases is a plus.
  • Experience with Agentic AI is a plus.
  • Proficiency in Python for data science and Streamlit for rapid prototyping.
  • Good knowledge of Git, ideally with Azure DevOps.
  • Experience working in an agile, international environment.
  • Experience setting up and using CI/CD pipelines and writing test-driven software.
  • Good documentation and coaching skills.

Equal Opportunity

Indegene is proud to be an Equal Opportunity Employer committed to inclusivity and diversity. We do not discriminate based on race, religion, sex, color, age, national origin, pregnancy, sexual orientation, physical ability, or any other characteristic. Employment decisions are based on merit and qualifications.
