Presentation of Diffusion-based speech enhancement with a weighted generative-supervised learning loss

Citation Author(s):
Jean-Eudes Ayilo, Mostafa Sadeghi, Romain Serizel
Submitted by:
Jean-Eudes AYILO
Last updated:
15 April 2024 - 11:09pm
Document Type:
Presentation Slides
Presenters:
Jean-Eudes AYILO

Diffusion-based generative models have recently gained attention in speech enhancement (SE), providing an alternative to conventional supervised methods. These models transform clean speech training samples into Gaussian noise, usually centered on the noisy speech, and then learn a parameterized model to reverse this process, conditioned on the noisy speech. Unlike supervised methods, generative SE approaches often rely solely on an unsupervised loss, which may result in less effective incorporation of the conditioning noisy speech. To address this issue, we propose augmenting the original diffusion training objective with an ℓ2 loss that measures the discrepancy between the ground-truth clean speech and its estimate at each diffusion time-step. Experimental results demonstrate the effectiveness of the proposed methodology.
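The idea in the abstract can be sketched in code. The snippet below is a minimal, illustrative NumPy sketch (not the authors' implementation): it combines a standard denoising score-matching term with a supervised ℓ2 term on a Tweedie-style clean-speech estimate at a given time-step. The function and argument names (`score_fn`, `sigma_t`, `alpha`), the simplified forward perturbation, and the weighting scheme are all assumptions for illustration; the actual method uses an SDE whose mean drifts toward the noisy speech and a learned weighting.

```python
import numpy as np

def weighted_diffusion_loss(score_fn, x0, y, t, sigma_t, alpha=0.5, rng=None):
    """Illustrative weighted generative-supervised loss (hypothetical sketch).

    score_fn: callable (x_t, y, t) -> score estimate, standing in for the
              conditional score network; x0: clean speech; y: noisy speech;
    sigma_t: noise level at time-step t; alpha: assumed supervised weight.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    z = rng.standard_normal(x0.shape)
    # Simplified forward diffusion: perturb clean speech with Gaussian noise
    # (the actual process also interpolates the mean toward y).
    x_t = x0 + sigma_t * z
    s = score_fn(x_t, y, t)
    # Unsupervised denoising score-matching term: zero when
    # s matches the true score -(x_t - x0) / sigma_t**2.
    loss_gen = np.mean((sigma_t * s + z) ** 2)
    # Supervised term: L2 between clean speech and its one-step
    # (Tweedie-style) estimate from the current score.
    x0_hat = x_t + sigma_t ** 2 * s
    loss_sup = np.mean((x0_hat - x0) ** 2)
    return loss_gen + alpha * loss_sup
```

With a perfect score function both terms vanish, while a degenerate (zero) score leaves a positive loss, which is a quick sanity check on the construction.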

