Voice Conversion

AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms


This paper describes a voice conversion (VC) method based on sequence-to-sequence (Seq2Seq) learning with attention and context preservation mechanisms. Seq2Seq models have performed strongly on numerous sequence-modeling tasks, including speech synthesis and recognition, machine translation, and image captioning.

Paper Details

Submitted On:
15 May 2019 - 7:03am

Document Files

2019_05_ICASSP_KouTanaka.pdf

[1] "AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4522. Accessed: May 23, 2019.

Cross-Lingual Voice Conversion with Bilingual Phonetic PosteriorGram and Average Modeling


This paper presents a cross-lingual voice conversion approach using bilingual Phonetic PosteriorGram (PPG) and average modeling. The proposed approach makes use of bilingual PPGs to represent speaker-independent features of speech signals from different languages in the same feature space. In particular, a bilingual PPG is formed by stacking two monolingual PPG vectors, which are extracted from two monolingual speech recognition systems. The conversion model is trained to learn the relationship between bilingual PPGs and the corresponding acoustic features.
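The stacking step described above can be illustrated with a minimal sketch. It assumes that "stacking" means concatenating the two monolingual PPG vectors frame by frame along the feature axis, and that both PPGs are already aligned to the same number of frames. The function name `stack_bilingual_ppg` and the toy posterior values are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def stack_bilingual_ppg(
    ppg_lang1: Sequence[Sequence[float]],
    ppg_lang2: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate two frame-aligned monolingual PPGs into one bilingual PPG.

    Each input is a list of frames; each frame is a list of phonetic class
    posteriors produced by one monolingual recognizer (assumed layout).
    """
    if len(ppg_lang1) != len(ppg_lang2):
        raise ValueError("both PPGs must have the same number of frames")
    # For every frame, append the second language's posteriors after the first's.
    return [list(f1) + list(f2) for f1, f2 in zip(ppg_lang1, ppg_lang2)]


if __name__ == "__main__":
    # Toy example: 2 frames, 3 phonetic classes for language 1, 2 for language 2.
    ppg_a = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
    ppg_b = [[0.6, 0.4], [0.5, 0.5]]
    bilingual = stack_bilingual_ppg(ppg_a, ppg_b)
    print(len(bilingual), len(bilingual[0]))
```

Under this assumption, each bilingual frame has as many dimensions as the two monolingual class inventories combined, which is what lets speech from either language be represented in the same feature space.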

Paper Details

Authors:
Yi Zhou, Xiaohai Tian, Haihua Xu, Rohan Kumar Das and Haizhou Li
Submitted On:
9 May 2019 - 3:46am

Document Files

cross lingual voice conversion with bilingual PPG

[1] Yi Zhou, Xiaohai Tian, Haihua Xu, Rohan Kumar Das and Haizhou Li, "Cross-Lingual Voice Conversion with Bilingual Phonetic PosteriorGram and Average Modeling", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4165. Accessed: May 23, 2019.