Investigations of real-time Gaussian FFTNet and parallel WaveNet neural vocoders with simple acoustic features

Abstract: 

This paper examines four approaches to improving real-time neural vocoders that use simple acoustic features (SAF), constructed from fundamental frequency and mel-cepstra, rather than mel-spectrograms. The investigations are as follows: 1) the effectiveness of single Gaussian (SG) autoregressive (AR) WaveNet and FFTNet vocoders with SAF, 2) the feasibility of training and synthesizing with an SG parallel WaveNet vocoder using SAF, 3) the impact of noise shaping on SG AR neural vocoders, and 4) the efficacy of bandwidth extension, in which SG AR neural vocoders synthesize speech waveforms at a sampling frequency of 24 kHz from SAF for a sampling frequency of 16 kHz. The experimental results indicate that SG AR WaveNet and real-time SG AR FFTNet vocoders with noise shaping using SAF can achieve sufficient synthesis quality together with a bandwidth extension effect. Moreover, a real-time SG parallel WaveNet vocoder can also be trained using SAF.
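
As a rough illustration of the "single Gaussian" output mentioned in the abstract: in the common formulation, the network predicts a mean and a log standard deviation for each waveform sample, and training minimizes the Gaussian negative log-likelihood of the true sample. The sketch below assumes that formulation; it is not the authors' implementation, and the function names `gaussian_nll` and `sample_gaussian` are hypothetical.

```python
import math
import random


def gaussian_nll(x, mean, log_std):
    """Negative log-likelihood of sample x under N(mean, exp(log_std)**2).

    Assumption: the vocoder outputs (mean, log_std) per waveform sample.
    """
    var = math.exp(2.0 * log_std)
    return 0.5 * math.log(2.0 * math.pi) + log_std + 0.5 * (x - mean) ** 2 / var


def sample_gaussian(mean, log_std, rng):
    """Draw one waveform sample from the predicted Gaussian."""
    return mean + math.exp(log_std) * rng.gauss(0.0, 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical network output for one time step.
    mean, log_std = 0.1, -2.0
    print("loss at target 0.12:", gaussian_nll(0.12, mean, log_std))
    print("drawn sample:", sample_gaussian(mean, log_std, rng))
```

In an autoregressive vocoder each drawn sample would be fed back as input for the next time step, whereas a parallel model would produce all samples in one pass.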

https://ieeexplore.ieee.org/document/8682320

Additionally, demo samples synthesized by WaveRNN and WaveGlow vocoders with SAF will be provided at the poster session.
Paper Code: SLP-P20.13
Session: Speech Synthesis II
Time: Friday, May 17, 08:30 - 10:30

Paper Details

Authors: Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai
Submitted On: 10 May 2019 - 9:39pm
Type: Poster
Presenter's Name: Takuma Okamoto
Paper Code: SLP-P20.13
Document Year: 2019

Document Files

icassp_2019_okamoto_1.pdf

[1] Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai, "Investigations of real-time Gaussian FFTNet and parallel WaveNet neural vocoders with simple acoustic features", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4280. Accessed: Oct. 17, 2019.