Fundamental AI Research Scientist, Multimodal Audio (Speech, Sound and Music) - FAIR

Facebook

New York (NY)

On-site

USD 147,000 - 208,000

Full time

30+ days ago

Job summary

An innovative firm is seeking Research Scientists to join its team focused on groundbreaking AI research. This role involves developing cutting-edge algorithms and conducting research that advances the science of intelligent machines. You will collaborate with a diverse team of scientists and engineers on projects exploring audio understanding and multimodality. The position offers access to state-of-the-art technology and resources, allowing you to publish impactful research and contribute to products that connect billions of users. If you are passionate about AI and eager to make a difference, this opportunity is perfect for you.

Benefits

Bonus
Equity
Health Benefits
Flexible Working Hours

Qualifications

  • 2+ years of experience in research roles with publications in relevant fields.
  • Expertise in machine learning, audio generation, and multimodal research.

Responsibilities

  • Develop algorithms for intelligent machines using state-of-the-art methodologies.
  • Conduct research across multiple modalities including audio and vision.

Skills

Machine Learning
Neural Networks
Python Programming
Deep Learning Frameworks
Research Publication

Education

Bachelor's degree in Computer Science
PhD in AI or related fields

Tools

PyTorch
TensorFlow

Job description

Summary:

Meta is seeking Research Scientists to join its Fundamental AI Research (FAIR) organization, focused on making significant advances in AI. We publish groundbreaking papers and release frameworks and libraries that are widely used in the open-source community. The team conducts industry-leading research on building foundation models for audio understanding and audio generation, and works closely with vision research teams to push the frontier of multimodal (audio, video, language) research. Our team's research focuses on audio and multimodality. Individuals in this role are expected to be recognized experts in research areas such as artificial intelligence, speech and audio generation, and audio-visual learning. Researchers will drive impact by: (1) publishing state-of-the-art research papers, (2) open-sourcing high-quality code and reproducible results for the community, and (3) bringing the latest research to Meta products that connect billions of users. They will work with an interdisciplinary team of scientists, engineers, and cross-functional partners, and will have access to cutting-edge technology, resources, and research facilities.

Required Skills:

Fundamental AI Research Scientist, Multimodal Audio (Speech, Sound and Music) - FAIR Responsibilities:

  1. Develop algorithms based on state-of-the-art machine learning and neural network methodologies.
  2. Perform research to advance the science and technology of intelligent machines.
  3. Conduct research that enables learning the semantics of data across multiple modalities (audio, speech, images, video, text, and other modalities).
  4. Work towards long-term ambitious research goals, while identifying intermediate milestones.
  5. Design and implement models and algorithms.
  6. Work with large datasets, train/tune/scale the models, create benchmarks to evaluate the performance, open source and publish.

Minimum Qualifications:

  1. Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.
  2. PhD degree in AI, computer science, data science, or related technical fields, or equivalent practical experience.
  3. 2+ years of experience holding an industry, faculty, academic, or government researcher position.
  4. Research publications reflecting experience in related research fields: audio (speech, sound, or music) generation, text-to-speech (TTS) synthesis, text-to-music generation, text-to-sound generation, speech recognition, speech/audio representation learning, vision perception, image/video generation, video-to-audio generation, audio-visual learning, audio language models, lip sync, lip movement generation/correction, lip reading, etc.
  5. Familiarity with one or more deep learning frameworks (e.g., PyTorch, TensorFlow, …).
  6. Experience with the Python programming language.
  7. Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment.

Preferred Qualifications:

  1. First-authored publications at peer-reviewed conferences such as ICML, NeurIPS, ICLR, ICASSP, Interspeech, ACL, EMNLP, CVPR, and other similar venues.
  2. Research and engineering experience demonstrated via publications, grants, fellowships, patents, internships, work experience, open source code, and/or coding competitions.
  3. Experience solving complex problems and comparing alternative solutions, trade-offs, and diverse points of view.
  4. Experience working and communicating cross functionally in a team environment.
  5. Experience communicating research findings to public audiences of peers.

Public Compensation:

$147,000/year to $208,000/year + bonus + equity + benefits

Industry: Internet

Equal Opportunity:

Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Meta participates in the E-Verify program in certain locations, as required by law. Please note that Meta may leverage artificial intelligence and machine learning technologies in connection with applications for employment.

Meta is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at accommodations-ext@fb.com.
