
DEEP SPEAKER REPRESENTATION USING ORTHOGONAL DECOMPOSITION AND RECOMBINATION FOR SPEAKER VERIFICATION

Abstract: 

Speech signals contain intrinsic and extrinsic variations such as accent, emotion, dialect, phoneme, speaking manner, noise, music, and reverberation. Some of these variations are unnecessary for speaker verification and act as unspecified factors of variation, which increase the variability of speaker representations. In this paper, we assume that such unspecified factors exist in speaker representations and attempt to minimize the variability they cause. The key idea is that a primal speaker representation can be decomposed into orthogonal vectors, which are then recombined using deep neural networks (DNNs) to reduce speaker representation variability, yielding a performance improvement in speaker verification (SV). Experimental results show that the proposed approach achieves a relative equal error rate (EER) reduction of 47.1% compared with using the same convolutional neural network (CNN) architecture on the VoxCeleb dataset. Furthermore, the proposed method provides a significant improvement for short utterances.
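The abstract does not say how the decomposition or the recombination is carried out, so the following is only a minimal illustrative sketch and not the authors' method. It assumes a fixed orthonormal basis obtained by QR factorization and a plain weight vector standing in for the DNN-based recombination. All function names (`orthonormal_basis`, `decompose`, `recombine`) and the dimension `dim = 8` are hypothetical.

```python
import numpy as np


def orthonormal_basis(dim, seed=0):
    """Return a (dim, dim) matrix whose columns are orthonormal (assumed basis)."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q


def decompose(x, basis):
    """Split x into mutually orthogonal component vectors.

    Column i of the result is the projection of x onto basis column i.
    """
    coeffs = basis.T @ x      # projection coefficients, shape (dim,)
    return basis * coeffs     # scales column i by coeffs[i], shape (dim, dim)


def recombine(components, weights):
    """Weighted sum of the components.

    In the paper a DNN performs the recombination; a fixed weight vector
    is used here purely as a stand-in.
    """
    return components @ weights


dim = 8  # hypothetical embedding size
x = np.random.default_rng(1).standard_normal(dim)  # stand-in speaker embedding
basis = orthonormal_basis(dim)
components = decompose(x, basis)

# The components are pairwise orthogonal: the Gram matrix is diagonal.
gram = components.T @ components
off_diagonal = gram - np.diag(np.diag(gram))

# Equal weights of 1 recover the original embedding exactly.
reconstructed = recombine(components, np.ones(dim))

# Unequal weights rescale individual directions of the embedding.
weights = np.linspace(1.0, 0.1, dim)
recombined = recombine(components, weights)
```

With unit weights the recombination is the identity, so any reduction in variability would have to come from weights that differ per component. In the paper that role is played by the trained network.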


Paper Details

Authors: Insoo Kim, Kyuhong Kim, Jiwhan Kim, Changkyu Choi
Submitted On: 13 May 2019 - 2:29am
Short Link:
Type: Poster
Event:
Presenter's Name: Insoo Kim
Paper Code: 1161
Document Year: 2019

Document Files

Poster_InsooKim.pdf


[1] Insoo Kim, Kyuhong Kim, Jiwhan Kim, Changkyu Choi, "DEEP SPEAKER REPRESENTATION USING ORTHOGONAL DECOMPOSITION AND RECOMBINATION FOR SPEAKER VERIFICATION", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4477. Accessed: Nov. 18, 2019.