Member of Technical Staff, Multimodal

Boson AI

Santa Clara (CA)

On-site

USD 100,000 - 160,000

Full time

4 days ago

Job summary

Boson AI, an innovative startup in Santa Clara, is seeking full-time research scientists and engineers to advance generative multimodal models. In this role, you'll design model architectures and loss objectives while working with diverse datasets across various media formats. Ideal candidates possess a strong machine learning background and are passionate about contributing to cutting-edge AI technology.

Qualifications

  • Experience in writing clean and efficient code.
  • Participation in at least one research project related to multimodality learning.
  • Strong coding and deep learning skills.

Responsibilities

  • Design model architectures and loss objectives for multimodal data.
  • Build diverse datasets for multimodality learning.
  • Develop evaluation pipelines for various generative outputs.

Skills

Machine Learning
Deep Learning
Multimodal Learning
Data Collection
Data Processing

Education

Master’s or Doctoral degree in computer science or equivalent

Tools

PyTorch
JAX

Job description

Boson AI is an early-stage startup building large language tools for everyone to use. Our founders (Alex Smola, Mu Li) and a team of scientists and engineers in Deep Learning, Optimization, NLP, AutoML, and Statistics are working on high-quality generative AI models for language and beyond.

We are seeking research scientists and engineers to join our team full-time in our Santa Clara office. Your role will involve designing model architectures, proposing new loss objectives, and advancing generative multimodal models. The ideal candidate has a strong machine learning background and is motivated to develop state-of-the-art models towards AGI.

We encourage you to apply even if you do not meet every qualification. If you are motivated to learn and contribute to foundation models, we would love to chat.

Responsibilities

  • Design model architectures and loss objectives for multimodal data (images, video, text, speech, audio)
  • Build diverse datasets for multimodality learning, including data collection and processing
  • Develop evaluation pipelines for various generative outputs

Qualifications

  • Experience in writing clean and efficient code
  • Master’s or Doctoral degree in computer science or equivalent
  • Proficiency in deep learning frameworks like PyTorch or JAX
  • Participation in at least one research project related to multimodality learning

Preferred Qualifications

  • Experience in multimodal joint embedding, text-to-image/video generation, etc.
  • Experience in document understanding (layout analysis, data extraction, OCR)
  • Experience in audio processing tasks
  • Active GitHub contributions are a plus
  • Experience handling data at billions-scale