NOVEL METRIC LEARNING FOR NON-PARALLEL VOICE CONVERSION

Abstract: 

Obtaining aligned spectral pairs from non-parallel data for a stand-alone Voice Conversion (VC) technique is a challenging research problem. The unsupervised alignment algorithm known as the Iterative combination of a Nearest Neighbor search step and a Conversion step Alignment (INCA) iteratively aligns the spectral features by minimizing the Euclidean distance between the intermediate converted and the target spectral feature vectors. However, the Euclidean distance may not correlate well with the perceptual distance between two (sound or visual) patterns in a given feature space. In this paper, we propose to learn a distance metric using the Large Margin Nearest Neighbor (LMNN) technique, which yields a small distance for the same phoneme uttered by different speakers and a larger distance for different phonemes. This learned metric is then used to find the Nearest Neighbor (NN) pairs in INCA. Furthermore, we propose to use the learned metric only for the first iteration of INCA, since the intermediate converted features (which are not actual acoustic features) may not behave well with respect to the learned metric. We obtained, on average, a 7.93% relative improvement in Phonetic Accuracy (PA). This improvement is reflected positively in the subjective and objective evaluations.
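The alignment loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the learned metric is represented by a linear transform `L` (so the distance is Euclidean after mapping each vector through `L`), which is assumed to have been trained beforehand, for example with LMNN; the conversion step is replaced by a simple least-squares affine map as a stand-in for whatever conversion model the paper uses; and the NN search runs in one direction only. All function names are hypothetical.

```python
import numpy as np


def nn_indices(queries, candidates, L=None):
    """Return, for each query row, the index of its nearest candidate row.

    If L is given, distances are Euclidean after mapping x -> L @ x,
    which equals a Mahalanobis distance with M = L.T @ L.
    If L is None, plain Euclidean distance is used.
    """
    if L is not None:
        queries = queries @ L.T
        candidates = candidates @ L.T
    # Pairwise squared distances, shape (n_queries, n_candidates).
    diff = queries[:, None, :] - candidates[None, :, :]
    return (diff ** 2).sum(axis=2).argmin(axis=1)


def fit_affine(src, tgt):
    """Least-squares affine map from src rows to tgt rows.

    Assumption: a stand-in for the conversion step, not the paper's model.
    """
    A = np.hstack([src, np.ones((len(src), 1))])
    W, *_ = np.linalg.lstsq(A, tgt, rcond=None)
    return W


def apply_affine(W, x):
    """Apply the affine map fitted by fit_affine to the rows of x."""
    return np.hstack([x, np.ones((len(x), 1))]) @ W


def inca_align(src, tgt, L=None, n_iter=5):
    """Alternate NN search and conversion for n_iter rounds.

    The learned metric L is used only in the first iteration; later
    iterations fall back to Euclidean distance, because the intermediate
    converted features are not actual acoustic features.
    """
    converted = src.copy()
    for it in range(n_iter):
        metric = L if it == 0 else None
        idx = nn_indices(converted, tgt, metric)   # NN search step
        W = fit_affine(src, tgt[idx])              # conversion step (fit)
        converted = apply_affine(W, src)           # conversion step (apply)
    return idx, converted
```

Calling `inca_align(src, tgt, L=L)` with two arrays of spectral feature vectors (one row per frame) returns the target index paired with each source frame and the converted source features; passing `L=None` gives a purely Euclidean variant of the same loop for comparison.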

Paper Details

Authors:
Nirmesh Shah, Hemant A. Patil
Submitted On:
8 May 2019 - 8:07am
Short Link:
Type:
Poster
Event:
Presenter's Name:
Nirmesh Shah
Paper Code:
MLSP-P15.2
Document Year:
2019

Document Files

main.pdf


[1] Nirmesh Shah, Hemant A. Patil, "NOVEL METRIC LEARNING FOR NON-PARALLEL VOICE CONVERSION", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4080. Accessed: Dec. 08, 2019.