DNN-BASED SPEAKER-ADAPTIVE POSTFILTERING WITH LIMITED ADAPTATION DATA FOR STATISTICAL SPEECH SYNTHESIS SYSTEMS

Abstract: 

Deep neural networks (DNNs) have been successfully deployed for acoustic modelling in statistical parametric speech synthesis (SPSS) systems. Moreover, DNN-based postfilters (PF) have also been shown to outperform conventional postfilters that are widely used in SPSS systems for increasing the quality of synthesized speech. However, existing DNN-based postfilters are trained with speaker-dependent databases. Given that SPSS systems can rapidly adapt to new speakers from generic models, there is a need for DNN-based postfilters that can adapt to new speakers with minimal adaptation data. Here, we compare DNN-, RNN-, and CNN-based postfilters together with adversarial (GAN) training and cluster-based initialization (CI) for rapid adaptation. Results indicate that the feedforward (FF) DNN, together with GAN and CI, significantly outperforms the other recently proposed postfilters.
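The abstract does not spell out how cluster-based initialization works, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that CI means picking a postfilter pre-trained on the speaker cluster closest to the new speaker and then fine-tuning it on the small adaptation set. A per-dimension affine map stands in for the feedforward DNN, adversarial (GAN) training is omitted, and every name (`mean_frame`, `nearest_cluster`, `fine_tune`, the toy data) is hypothetical.

```python
# Sketch only: a per-dimension affine map (scale, bias) stands in for the
# feedforward postfilter; the cluster-selection rule is an assumption.

def mean_frame(frames):
    """Average a list of feature frames dimension by dimension."""
    n = len(frames)
    return [sum(f[d] for f in frames) / n for d in range(len(frames[0]))]

def nearest_cluster(centroids, speaker_mean):
    """Index of the centroid with the smallest squared Euclidean distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, speaker_mean))
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i]))

def apply_postfilter(params, frame):
    """Map one synthesized frame toward natural speech features."""
    scale, bias = params
    return [s * x + b for s, x, b in zip(scale, frame, bias)]

def mse(params, synth_frames, natural_frames):
    """Mean squared error between postfiltered and natural frames."""
    total, count = 0.0, 0
    for x, y in zip(synth_frames, natural_frames):
        pred = apply_postfilter(params, x)
        total += sum((p - t) ** 2 for p, t in zip(pred, y))
        count += len(y)
    return total / count

def fine_tune(params, synth_frames, natural_frames, lr=0.05, steps=200):
    """Adapt the postfilter with plain gradient descent on the MSE."""
    scale, bias = list(params[0]), list(params[1])
    n = len(synth_frames)
    for _ in range(steps):
        g_scale = [0.0] * len(scale)
        g_bias = [0.0] * len(bias)
        for x, y in zip(synth_frames, natural_frames):
            for d in range(len(scale)):
                err = scale[d] * x[d] + bias[d] - y[d]
                g_scale[d] += 2.0 * err * x[d] / n
                g_bias[d] += 2.0 * err / n
        scale = [s - lr * g for s, g in zip(scale, g_scale)]
        bias = [b - lr * g for b, g in zip(bias, g_bias)]
    return scale, bias

if __name__ == "__main__":
    # Toy data: two hypothetical clusters, each with its own postfilter.
    centroids = [[0.0, 0.0], [5.0, 5.0]]
    cluster_filters = [([1.0, 1.0], [0.0, 0.0]), ([1.1, 1.1], [0.0, 0.0])]
    synth = [[4.0, 5.0], [5.0, 4.5], [4.5, 5.5]]
    natural = [[1.2 * v + 0.1 for v in f] for f in synth]
    init = cluster_filters[nearest_cluster(centroids, mean_frame(synth))]
    tuned = fine_tune(init, synth, natural, lr=0.01)
    print(mse(init, synth, natural), mse(tuned, synth, natural))
```

Starting from the nearest cluster's parameters rather than from a random or generic model is what would let a handful of adaptation frames suffice in this sketch; the learning rate has to stay small relative to the feature magnitudes for plain gradient descent to remain stable.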


Paper Details

Authors: Miraç Göksu Öztürk, Okan Ulusoy, Cenk Demiroglu
Submitted On: 15 February 2019 - 5:17am
Short Link:
Type: Poster
Event:
Presenter's Name: Miraç Göksu Öztürk
Paper Code: 2841
Document Year: 2019

Document Files

icassp19_poster.pdf


[1] Miraç Göksu Öztürk, Okan Ulusoy, Cenk Demiroglu, "DNN-BASED SPEAKER-ADAPTIVE POSTFILTERING WITH LIMITED ADAPTATION DATA FOR STATISTICAL SPEECH SYNTHESIS SYSTEMS", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/3851. Accessed: Jul. 23, 2019.