Data Scientist - Gen AI Specialist

Indegene

Marbella

On-site

EUR 50,000 - 70,000

Full-time

Posted yesterday

Job summary

A global consultancy in Andalucía is seeking experienced AI Specialists to develop and train Generative AI models, perform data analysis, and integrate AI models with Snowflake and AWS. The ideal candidate has a strong background in machine learning and Generative AI, with proficiency in unstructured data processing and experience in AWS cloud solutions.

Qualifications

  • Strong knowledge of machine learning and Generative AI.
  • Experience building scalable (Gen) AI applications on AWS.
  • Proficiency in unstructured data processing including PDF and Word parsing.

Responsibilities

  • Develop and train Generative AI models.
  • Perform data analysis and prepare data for AI model training.
  • Integrate AI models with Snowflake and AWS.

Skills

Machine learning
Generative AI
AWS Bedrock
OpenAI models
Unstructured data processing
OCR tools
Python
Git

Tools

spaCy
NLTK
Hugging Face Transformers
AWS Textract
Tesseract OCR
FAISS
Pinecone

Job description

Who are we?

Indegene is a global consultancy at the forefront of driving innovation in the Pharmaceutical and Life Sciences industry, combining medical and commercial expertise with innovative digital and AI technologies.

We enable global healthcare organizations to address complex challenges and drive better health and business outcomes by seamlessly integrating analytics, technology, operations, and medical expertise. Find out more at indegene.com.

Who are you?

We are seeking experienced AI Specialists to develop and train Generative AI models, perform data analysis, and prepare data for AI model training. The role also involves integrating AI models with Snowflake, AWS, and other systems.

Required knowledge:

  • Strong knowledge of machine learning and Generative AI, especially content generation using AWS Bedrock and OpenAI models.
  • Experience building scalable (Gen) AI applications on AWS.
  • Proficiency in unstructured data processing and extraction, including PDF, Word, and HTML parsing using tools like PyMuPDF, Apache Tika, or Textract.
  • Familiarity with OCR tools such as Tesseract OCR, AWS Textract, or Azure Form Recognizer.
  • Ability to clean, preprocess, and structure text using spaCy, NLTK, or Hugging Face Transformers, with knowledge of NER.
  • Experience with vector databases like FAISS, Qdrant, Pinecone, Weaviate, or ChromaDB for semantic search and retrieval.
  • Understanding of embedding models such as OpenAI, Cohere, or Sentence-BERT.
  • Experience in chunking and indexing large documents while maintaining metadata.
  • Strong background in vector databases; experience with Snowflake-based (Gen) AI solutions and graph databases is a plus.
  • Knowledge of Agentic AI is a plus.
  • Proficiency in Python for data science and Streamlit for rapid prototyping.
  • Experience with Git, Azure DevOps, CI/CD pipelines, and test-driven development.
  • Ability to work in an agile, international environment with good documentation and coaching skills.

Indegene is an Equal Opportunity Employer committed to inclusion and diversity. All qualified applicants will receive consideration without regard to race, religion, sex, or other protected characteristics.
