Research Engineer, Speech Foundation Models

Tykhe Inc

Palo Alto (CA)

On-site

USD 150,000 - 230,000

Full time

4 days ago

Job summary

Tykhe Inc is seeking a Research Lead for Speech, Audio, and Conversational AI. In this role, you will drive research in advanced Audio Language Models and lead projects on speech synthesis and real-time AI applications, contributing to innovative solutions in speech technology.

Qualifications

Ph.D. focused on speech processing and audio analysis.
Expertise in developing and optimizing low-latency models.
Proficiency in deep learning frameworks and high-performance computing.

Responsibilities

Develop cutting-edge technologies in speech processing and AI.
Research and deploy novel generative AI methods.
Collaborate with cross-functional teams to integrate models into products.

Skills

Speech processing

Audio analysis

Machine learning

Multilingual speech recognition

Neural network architectures

Deep learning frameworks

Education

Ph.D. in Computer Science

Tools

TensorFlow

PyTorch

Research Engineer, Speech Foundation Models

We are seeking a highly skilled and experienced Research Lead for Speech, Audio, and Conversational AI to join our innovative team. In this role, you will spearhead the research and development of cutting-edge technologies in speech processing, text-to-speech (TTS), audio analysis, and real-time conversational AI. You will push the boundaries of what's possible in automatic speech recognition (ASR), speaker identification, diarization, speech synthesis, voice cloning, dubbing and audio generation.

Key Responsibilities:

Apply the state of the art in audio/speech and large language models to develop advanced Audio Language Models and Speech Language Models.
Research, architect, and deploy new generative AI methods such as autoregressive models, causal models, and diffusion models.
Design and implement low-latency end-to-end models with multilingual speech/audio as both input and output.
Conduct experiments to evaluate and improve the performance of these models, focusing on accuracy, naturalness, efficiency, and real-time capabilities across multiple languages.
Stay at the forefront of advancements in speech processing, audio analysis, and large language models, integrating new techniques into our foundation models.
Collaborate with cross-functional teams to integrate these foundation models into Krutrim's AI stack and products.
Publish research findings in top-tier conferences and journals such as INTERSPEECH, ICASSP, ICLR, ICML, NeurIPS, and IEEE/ACM Transactions on Audio, Speech, and Language Processing.
Mentor and guide junior researchers and engineers, fostering a collaborative and innovative team environment.
Drive the adoption of best practices in model development, including rigorous testing, documentation, and ethical considerations in multilingual AI.

Qualifications:

Ph.D. in Computer Science, Electrical Engineering, or a related field with a focus on speech processing, audio analysis, and machine learning.
Experience training speech and audio models for representation (e.g., W2V-BERT, SONAR, AST), generation (e.g., HiFi-GAN, VQ-GAN, AudioLDM), Conformers, and multilingual multitask models (e.g., SeamlessM4T).
Expertise with Audio Language Models such as AudioPaLM, Moshi, and SeamlessM4T.
Proven track record of developing and applying novel neural network architectures such as Transformers, Mixture of Experts, Diffusion Models, and State Space Models (Mamba, Samba).
Extensive experience in developing and optimizing models for low-latency, real-time applications.
Strong background in multilingual speech recognition, voice cloning, dubbing and synthesis, with an understanding of the challenges specific to different language families.
Proficiency in deep learning frameworks (e.g., TensorFlow, PyTorch) and experience deploying large-scale speech and audio models.
Demonstrated expertise in high-performance computing with proficiency in Python, C/C++, CUDA, and kernel-level programming for AI applications.
Experience with audio signal processing techniques and their application in end-to-end neural models.

Seniority level
Mid-Senior level

Employment type
Full-time

Job function

Industries
Software Development

Similar jobs

Senior Research Engineer, Foundation Model Training Infrastructure

NVIDIA Corporation

Santa Clara

On-site

USD 224,000 - 357,000

4 days ago

⭐ AI & Robotics Research Engineer - Dexterous Manipulation

Proception.AI

Palo Alto

On-site

USD 120,000 - 160,000

3 days ago

Machine Learning Research Engineer (1 Year Fixed Term)

Stanford University

Stanford

On-site

USD 126,000 - 152,000

Yesterday

Machine Learning Research Engineer

Antler

San Francisco

On-site

USD 130,000 - 180,000

Yesterday

Research Engineer, World Models

Palo Alto

On-site

USD 130,000 - 250,000

2 days ago

Machine Learning Research Engineer

Prima Mente

San Francisco

On-site

USD 120,000 - 180,000

6 days ago

Senior Research Engineer - Applied Research

Luma AI

Palo Alto

On-site

USD 175,000 - 250,000

4 days ago

Senior Research engineer - Multimodal Language Models

Luma AI

Palo Alto

On-site

USD 200,000 - 300,000

4 days ago

Senior Research Engineer - Foundation Models

Luma AI

Palo Alto

On-site

USD 180,000 - 250,000

4 days ago
