Speaker Invariant Feature Extraction for Zero-Resource Languages with Adversarial Learning

Abstract: 

We introduce a novel type of representation learning that yields speaker-invariant features for zero-resource languages. Speaker adaptation is an important technique for building robust acoustic models. For a zero-resource language, however, conventional model-dependent speaker adaptation methods such as constrained maximum likelihood linear regression are insufficient because the acoustic model of the target language is not accessible. We therefore introduce a model-independent feature extraction method based on a neural network. Specifically, we add multi-task learning to a bottleneck-feature-based approach so that the bottleneck features become invariant to changes of speaker. The proposed network tackles two tasks simultaneously: phoneme classification and speaker classification. The feature extractor is trained adversarially so that it maps input data to a representation from which phonemes are easy to predict but speakers are difficult to predict. We conduct phone discrimination experiments on the Zero Resource Speech Challenge 2017. The experimental results show that our multi-task network yields more discriminative features by removing speaker variability.
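
The adversarial multi-task idea in the abstract can be illustrated with a deliberately tiny sketch. It is not the authors' network: it assumes a scalar bottleneck, binary phoneme and speaker labels, sigmoid heads without biases, and a gradient-reversal style update (one common way to realize adversarial training; the abstract does not say which mechanism the paper uses). All names (`forward`, `extractor_gradient`, `train_step`, `lam`) are hypothetical.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def forward(params, x):
    """Scalar bottleneck z feeds a phoneme head and a speaker head."""
    z = sum(w * xi for w, xi in zip(params["w"], x))
    p_phone = sigmoid(params["a"] * z)  # phoneme head (assumed binary)
    p_spk = sigmoid(params["c"] * z)    # speaker head (assumed binary)
    return z, p_phone, p_spk


def extractor_gradient(params, x, y_phone, y_spk, lam):
    """Gradient for the extractor weights with the speaker term reversed."""
    _, p_phone, p_spk = forward(params, x)
    # d(binary cross-entropy)/dz for each head
    dz_phone = (p_phone - y_phone) * params["a"]
    dz_spk = (p_spk - y_spk) * params["c"]
    # Descend on the phoneme loss, ascend on the speaker loss
    dz = dz_phone - lam * dz_spk
    return [dz * xi for xi in x]


def train_step(params, x, y_phone, y_spk, lam=1.0, lr=0.1):
    """One update: heads minimize their own losses, extractor is adversarial."""
    z, p_phone, p_spk = forward(params, x)
    grad_w = extractor_gradient(params, x, y_phone, y_spk, lam)
    # Both heads take ordinary gradient-descent steps
    params["a"] -= lr * (p_phone - y_phone) * z
    params["c"] -= lr * (p_spk - y_spk) * z
    # The extractor uses the combined, partly reversed gradient
    params["w"] = [w - lr * g for w, g in zip(params["w"], grad_w)]
    return params


params = {"w": [0.0, 0.0], "a": 1.0, "c": 1.0}
grad = extractor_gradient(params, [1.0, 2.0], y_phone=1, y_spk=1, lam=2.0)
```

In this sketch `lam` trades off the two objectives: with `lam = 0` the extractor is trained on phoneme classification alone, and larger values push the bottleneck harder toward discarding whatever the speaker head can exploit.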

Paper Details

Authors:
Taira Tsuchiya, Naohiro Tawara, Tetsuji Ogawa, Tetsunori Kobayashi
Submitted On:
13 April 2018 - 10:12am
Short Link:
Type:
Presentation Slides
Event:
Presenter's Name:
Taira Tsuchiya
Paper Code:
MLSP-L8.1
Document Year:
2018

Document Files

speaker-invariant-feature-extraction-for-zero-resource-languages-with-adversarial-learning.pdf

[1] Taira Tsuchiya, Naohiro Tawara, Tetsuji Ogawa, Tetsunori Kobayashi, "Speaker Invariant Feature Extraction for Zero-Resource Languages with Adversarial Learning", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/2709. Accessed: Apr. 23, 2019.