PERSONALIZED ACOUSTIC MODELING BY WEAKLY SUPERVISED MULTI-TASK DEEP LEARNING USING ACOUSTIC TOKENS DISCOVERED FROM UNLABELED DATA

Abstract: 

It is well known that recognizers personalized to each user are much more effective than user-independent recognizers. With the popularity of smartphones today, it is not difficult to collect a large set of audio data for each user, but it is difficult to transcribe it. However, it is now possible to automatically discover acoustic tokens from unlabeled personal data in an unsupervised way. We therefore propose a multi-task deep learning framework called a phoneme-token deep neural network (PTDNN), jointly trained on unsupervised acoustic tokens discovered from unlabeled data and on very limited transcribed data for personalized acoustic modeling. We term this scenario "weakly supervised". The underlying intuition is that the high degree of similarity between the HMM states of acoustic token models and those of phoneme models may help the two tasks learn from each other in this multi-task learning framework. Initial experiments performed on a personalized audio data set recorded from Facebook posts demonstrated that very good improvements can be achieved in both frame accuracy and word accuracy over commonly used baselines such as fDLR, speaker code and lightly supervised adaptation. This approach complements existing speaker adaptation approaches and can be used jointly with such techniques to yield improved results.
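The joint training idea in the abstract can be illustrated with a minimal sketch: one shared hidden representation feeds two softmax outputs, one over phoneme HMM states and one over acoustic-token HMM states. The abstract does not give the network details, so everything concrete here is an assumption made for illustration: the single ReLU hidden layer, the layer sizes, the names `TwoHeadNet` and `joint_loss`, and the `token_weight` parameter are all hypothetical and are not taken from the paper. The sketch only computes the forward pass and a combined cross-entropy loss; it assumes every frame has a token-state label (tokens are discovered without transcription), while only transcribed frames have a phoneme-state label.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def affine(weights, bias, x):
    """Compute weights @ x + bias for a list-of-rows weight matrix."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (hypothetical initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class TwoHeadNet:
    """Shared hidden layer feeding a phoneme-state head and a token-state head."""

    def __init__(self, n_feat, n_hidden, n_phone_states, n_token_states, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(rng, n_hidden, n_feat)
        self.phone_head = init_layer(rng, n_phone_states, n_hidden)
        self.token_head = init_layer(rng, n_token_states, n_hidden)

    def forward(self, frame):
        """Return (phoneme-state posteriors, token-state posteriors) for one frame."""
        hidden = [max(0.0, h) for h in affine(*self.shared, frame)]  # ReLU (assumed)
        phone_post = softmax(affine(*self.phone_head, hidden))
        token_post = softmax(affine(*self.token_head, hidden))
        return phone_post, token_post


def joint_loss(net, frames, token_labels, phone_labels, token_weight=1.0):
    """Combined cross-entropy over both tasks.

    token_labels: one token-state index per frame (assumed available for all frames).
    phone_labels: phoneme-state index per frame, or None for untranscribed frames.
    token_weight: hypothetical scalar balancing the two tasks.
    """
    phone_loss, token_loss, n_phone = 0.0, 0.0, 0
    for frame, tok, pho in zip(frames, token_labels, phone_labels):
        phone_post, token_post = net.forward(frame)
        token_loss -= math.log(token_post[tok])
        if pho is not None:  # only transcribed frames contribute to the phoneme task
            phone_loss -= math.log(phone_post[pho])
            n_phone += 1
    token_loss /= len(frames)
    phone_loss = phone_loss / n_phone if n_phone else 0.0
    return phone_loss + token_weight * token_loss
```

In this sketch, a call such as `joint_loss(net, frames, token_labels=[1, 4], phone_labels=[2, None])` treats the second frame as untranscribed, so it influences the shared layer only through the token head. A real system would add gradient-based training and use only the phoneme head at recognition time; those parts are omitted here.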

Paper Details

Authors: Cheng-Kuan Wei, Cheng-Tao Chung, Hung-Yi Lee, Lin-Shan Lee
Submitted On: 1 March 2017 - 12:56am
Short Link:
Type: Poster
Event:
Presenter's Name: Cheng-Kuan Wei
Paper Code: ICASSP1701
Document Year: 2017

Document Files

PTDNN_ICASSP2017_Poster_v5.3.pdf

[1] Cheng-Kuan Wei, Cheng-Tao Chung, Hung-Yi Lee, Lin-Shan Lee, "PERSONALIZED ACOUSTIC MODELING BY WEAKLY SUPERVISED MULTI-TASK DEEP LEARNING USING ACOUSTIC TOKENS DISCOVERED FROM UNLABELED DATA", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1531. Accessed: Oct. 15, 2019.
@article{1531-17,
url = {http://sigport.org/1531},
author = {Cheng-Kuan Wei and Cheng-Tao Chung and Hung-Yi Lee and Lin-Shan Lee},
publisher = {IEEE SigPort},
title = {PERSONALIZED ACOUSTIC MODELING BY WEAKLY SUPERVISED MULTI-TASK DEEP LEARNING USING ACOUSTIC TOKENS DISCOVERED FROM UNLABELED DATA},
year = {2017} }