SEQUENCE-BASED MULTI-LINGUAL LOW RESOURCE SPEECH RECOGNITION

Abstract: 

Techniques for multi-lingual and cross-lingual speech recognition can help in low-resource scenarios by bootstrapping systems and enabling analysis of new languages and domains. End-to-end approaches, in particular sequence-based techniques, are attractive because of their simplicity and elegance. While it is possible to integrate traditional multi-lingual bottleneck feature extractors as front-ends, we show that end-to-end multi-lingual training of sequence models is effective for context-independent models trained using Connectionist Temporal Classification (CTC) loss. We show that our model improves performance on Babel languages by over 6% absolute in terms of word/phoneme error rate when compared to mono-lingual systems built in the same setting for these languages. We also show that the trained model can be adapted cross-lingually to an unseen language using just 25% of the target data. We show that training on multiple languages is important for very low-resource cross-lingual target scenarios, but not for multi-lingual testing scenarios. Here, it appears beneficial to include large, well-prepared datasets.
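
To make the "absolute" improvement in word/phoneme error rate concrete, here is a minimal sketch of how such an error rate is commonly computed: the edit distance between reference and hypothesis token sequences, divided by the reference length. The function names and the toy phoneme sequences are illustrative assumptions, not the authors' scoring code or data from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Error rate in percent: edit distance over reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Toy example with made-up phoneme sequences (not results from the paper).
reference = "k ae t s".split()
hyp_mono = "k ae d".split()    # hypothetical mono-lingual system output
hyp_multi = "k ae t".split()   # hypothetical multi-lingual system output

per_mono = error_rate(reference, hyp_mono)
per_multi = error_rate(reference, hyp_multi)
absolute_gain = per_mono - per_multi                  # in percentage points
relative_gain = 100.0 * absolute_gain / per_mono      # in percent of baseline
```

An absolute improvement is the plain difference between the two error rates in percentage points, while a relative improvement divides that difference by the baseline error rate. The abstract reports the former.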

Paper Details

Authors: Siddharth Dalmia, Ramon Sanabria, Florian Metze, Alan W Black
Submitted On: 18 April 2018 - 3:03pm
Type: Presentation Slides
Presenter's Name: Siddharth Dalmia
Paper Code: 4000
Document Year: 2018

Document Files

Dalmia_ICASSP_2018.pdf


[1] Siddharth Dalmia, Ramon Sanabria, Florian Metze, Alan W Black, "SEQUENCE-BASED MULTI-LINGUAL LOW RESOURCE SPEECH RECOGNITION", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/2970. Accessed: Jun. 19, 2019.
@article{2970-18,
  url = {http://sigport.org/2970},
  author = {Siddharth Dalmia and Ramon Sanabria and Florian Metze and Alan W Black},
  publisher = {IEEE SigPort},
  title = {SEQUENCE-BASED MULTI-LINGUAL LOW RESOURCE SPEECH RECOGNITION},
  year = {2018}
}