GenAI Data Scientist

Relevance Lab

Rio de Janeiro

On-site

BRL 160,000 - 200,000

Full-time

Posted today

Job summary

A technology solutions provider in Brazil seeks a Senior Data Scientist with expertise in NLP and large language models. This role involves architecting frameworks for GenAI products, mentoring junior data scientists, and collaborating with cross-functional teams to develop innovative AI solutions. Candidates should possess 8-10 years of experience, an advanced degree in a relevant field, and strong proficiency in Python and deep learning frameworks. This position offers the opportunity to drive impactful AI projects using cutting-edge technologies.

Qualifications

  • 8 to 10 years of overall experience in data science.
  • Minimum of three years of experience developing machine learning models.
  • At least five years of experience with Python.
  • Experience with production NLP models and large language models.

Responsibilities

  • Architect the framework for GenAI products such as chatbots and summarizers.
  • Collaborate with product management to align on the technical roadmap.
  • Establish protocols for transparent LLM applications.
  • Mentor junior data scientists in GenAI methods.

Skills

Python
Machine Learning
NLP Techniques
Deep Learning
AI Applications
Data Visualization
LangChain
Cloud Computing

Education

Advanced degree in relevant field

Tools

PyTorch
TensorFlow
AWS

Job description

As a Senior Data Scientist on our GenAI applications team, you will translate the needs of our cross-functional stakeholders into user-facing applications that leverage NLP techniques and large language models (LLMs). You will work on products such as conversational search interfaces, chatbots, text summarizers, and recommender engines, based on stakeholder needs. You will partner with Product Managers, Machine Learning Engineers, Cloud Platform Engineers, and other cross-functional partners to develop production-grade algorithms.

You will contribute to the creation, delivery, and production of specific data science, machine learning, and AI products for internal stakeholders, and you will directly mentor other data professionals, data analysts, and machine learning engineers. In conjunction with key partners, you will make technical decisions on data, technology, and ways of working.

Duties and Responsibilities
  • Architect the overall framework and infrastructure for GenAI products such as search interfaces, bots, and summarizers. Develop and implement techniques to optimize model performance to meet specific product goals.
  • Collaborate closely with product management and engineering leads to align on the technical roadmap. Guide engineering teams to effectively leverage LLM capabilities in product implementations.
  • Establish protocols and systems for building fair, accountable, and transparent LLM-based applications. Lead efforts to proactively assess and mitigate risks due to model biases or failures.
  • Implement robust feedback pipelines, monitoring, and correction mechanisms to ensure model safety.
  • Design and oversee curation of high-quality datasets tailored for LLM training for each product. Build data science pipelines spanning feature generation, data visualization, and model evaluation; design the solution, build initial code, and provide documentation with ways of working to minimize time to value and maximize reusability.
  • Communicate clearly and effectively to technical and non-technical audiences, verbally and visually, to create understanding, engagement, and buy-in. Contribute novel research and analyses to leading academic conferences and journals.
  • Identify trends and opportunities to drive innovation, both in what we do and how we do it; evaluate new data science, machine learning, and AI technologies and tools that can boost team performance, innovation, and business value. Proactively analyze the latest developments in large language models to deeply understand model capabilities, limitations, and best practices. Develop techniques to continually improve language understanding and model training.
  • Mentor and develop junior data scientists in state-of-the-art GenAI methods.
  • Set technical vision and lead initiatives to accelerate product impact through cutting-edge LLM innovations.
  • Complete other responsibilities as assigned.

Required Skills and Qualifications
  • Total experience: 8 to 10+ years.
  • Advanced degree in mathematics, physics, computer science, engineering, statistics, or an equivalent technical discipline.
  • Minimum of three years’ experience in developing machine learning models with a track record of creating meaningful business impact and working with multiple stakeholders.
  • Minimum of five years’ experience with Python.
  • Minimum of three years’ experience building production NLP and deep learning models using PyTorch/TensorFlow, along with using large language model architectures (BERT, GPT-3, etc.).
  • Experience building advanced workflows such as retrieval-augmented generation, model chaining, dynamic prompting, PEFT/SFT, etc. using LangChain and similar tools.
  • Experience establishing model guardrails and developing bias detection and mitigation techniques for AI applications.
  • Proficiency with various prompting techniques, with a clear understanding of the tradeoffs between prompting and fine-tuning.
  • Experience with fine-tuning embedding models and tuning vector databases to improve the performance of semantic search and retrieval systems.
  • Deep understanding of underlying fundamentals such as Transformers and self-attention mechanisms, which form the theoretical foundation of LLMs.
  • Experience with cloud computing platforms and tools (AWS).
  • Experience operationalizing end-to-end machine learning applications.