Research Scientist - Model Team

RxREVU, Inc.

Berlin

On-site

EUR 80,000 - 100,000

Full-time

Posted today

Summary

A tech startup in Berlin is seeking a researcher to develop multimodal generative models for audio creation. In this role, you will contribute to the entire lifecycle of model development, from design and training to deployment. Ideal candidates will have a background in training large-scale generative models and a strong understanding of modern deep learning. This position offers competitive compensation and the opportunity to shape new audio technologies.

Benefits

Competitive compensation and equity

Autonomy and responsibility in work

Opportunity to innovate in audio technology

Qualifications

Experience in a fast-paced research environment.
Deep understanding of machine learning research methods.
Strong track record in working on generative models.

Responsibilities

Design and train large-scale multimodal generative models.
Explore new modeling ideas for audio generation.
Conduct ablation studies to gain insights.

Skills

Hands-on experience in training large-scale generative models

Strong proficiency in PyTorch

Understanding of distributed training techniques

Proficiency with debugging and optimizing GPU operations

Mirelo AI is building the next generation of creative tools by generating realistic sound, speech and music from video.

We develop cutting‑edge foundational generative AI models that "unmute" silent video content and create custom, hyper‑realistic audio for gaming, video platforms, and creators. Our technology empowers global storytellers to transform their content.

We recently closed a $41 million Seed round co‑led by Andreessen Horowitz and Index Ventures with participation from Atlantic, and are rapidly expanding across Product, Engineering, Go‑to‑Market, and Growth.

About the Role

At Mirelo, you'll work at the centre of how we build the next generation of multimodal video‑to‑audio models. This role is deeply hands‑on and research‑heavy: with a high ratio of H100/H200 GPUs per engineer, you'll explore and build new multimodal models and push the boundaries of what's possible in music, sound, and speech generation. You'll collaborate closely across research and engineering, run focused ablations, and translate experimental results into clear next steps for the team. From data curation to deployment, you'll help shape the full lifecycle of the models that power our products and partnerships.

Key Responsibilities

Design, implement and train large‑scale multimodal generative models for audio generation (diffusion and/or autoregressive models).
Explore new modeling ideas for audio generation (music, sound, speech) while taking inspiration from the language and image domains.
Develop and experiment with post‑training for new capabilities (fine‑grained control, in/out‑painting, editing, ...).
Conduct rigorous ablation studies, extract actionable insights, and communicate results to the team to inform new research directions.
Contribute hands‑on to all stages of model development including data curation, experimentation, evaluation, and deployment.

Ideal Candidate Profile

Hands‑on experience in training large‑scale generative models in a fast‑paced research environment.
Deep understanding of cutting‑edge methods and ML research in at least one of the following domains: image, language, video, or audio (audio‑specific experience is a plus but not required).
Strong proficiency in PyTorch, transformer architectures, and the full ecosystem of modern deep learning.
Solid understanding of distributed training techniques: FSDP, low‑precision training, and model parallelism.
Strong track record of working on generative models (publications in top‑tier venues, open‑source contributions, or applied ML projects).

Nice to Have

Proficiency with profiling, debugging, and optimizing single and multi‑GPU operations using tools like Nsight or stack trace viewers.
Strong software engineering skills and experience collaborating on large codebases that go beyond PhD research code.
Experience with generative models for audio (sound, music or speech) and audio codec design.

Why Join?

Join at a pivotal moment. We've secured fresh funding and are gaining traction. Now is when your contributions can make a real difference to our success.
True ownership from day one. You'll have genuine autonomy and responsibility. Your ideas and work will directly shape our product and company direction.
Competitive compensation and equity. We offer strong packages that ensure you share in the success you help create.
Build for the next generation of creators. Be part of the innovation that will transform how creators work and thrive.

We welcome applications from all individuals, regardless of ethnic origin, gender, disability, religion or belief, age, or sexual orientation and identity.
