IMPROVING THE PERFORMANCE OF TRANSFORMER BASED LOW RESOURCE SPEECH RECOGNITION FOR INDIAN LANGUAGES

Citation Author(s):
Vishwas M. Shetty, Metilda Sagaya Mary N J, S. Umesh
Submitted by:
Vishwas Shetty
Last updated:
19 May 2020 - 3:23am
Document Type:
Presentation Slides
Document Year:
2020
Event:
Presenters:
Vishwas M Shetty
Paper Code:
3309
 

The recent success of the Transformer-based sequence-to-sequence framework on various Natural Language Processing tasks has motivated its application to Automatic Speech Recognition. In this work, we explore the application of Transformers to low-resource Indian languages in a multilingual framework. We investigate several methods to incorporate language information into a multilingual Transformer: (i) at the decoder and (ii) at the encoder. These methods include using language identity tokens or providing language information to the acoustic vectors, either as a one-hot vector or as a learned language embedding. In our experiments, providing language identity always improved performance. The language embedding learned with our proposed approach, when added to the acoustic feature vector, gave the best result. The proposed approach with retraining gave 6%-11% relative improvements in character error rate over the monolingual baseline.
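The two encoder-side conditioning options described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the number of languages, the 80-dimensional filterbank features, and the function names are assumptions, and the embedding table here is randomly initialized rather than learned jointly with the Transformer as in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_LANGS = 3   # hypothetical number of Indian languages in the mix
FEAT_DIM = 80   # hypothetical filterbank feature dimension

def append_one_hot(features, lang_id, num_langs=NUM_LANGS):
    """Option (i): concatenate a one-hot language vector to every frame.

    features: (num_frames, FEAT_DIM) acoustic feature matrix.
    Returns a (num_frames, FEAT_DIM + num_langs) matrix.
    """
    one_hot = np.zeros(num_langs)
    one_hot[lang_id] = 1.0
    tiled = np.tile(one_hot, (features.shape[0], 1))
    return np.concatenate([features, tiled], axis=1)

# Option (ii): a learned language embedding, same size as the features,
# added element-wise to every frame. In the actual system this table
# would be a trainable parameter of the network.
lang_embeddings = 0.01 * rng.standard_normal((NUM_LANGS, FEAT_DIM))

def add_language_embedding(features, lang_id):
    """Add the language embedding to every acoustic frame (broadcast)."""
    return features + lang_embeddings[lang_id]

frames = rng.standard_normal((100, FEAT_DIM))  # 100 frames of features
x_onehot = append_one_hot(frames, lang_id=1)        # (100, 83)
x_embed = add_language_embedding(frames, lang_id=1) # (100, 80)
```

Note the design difference: concatenating a one-hot vector grows the input dimension by the number of languages, while an added embedding keeps the encoder input size fixed, which is what allows it to be dropped into an existing architecture unchanged.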
