End-to-end Keywords Spotting Based on Connectionist Temporal Classification for Mandarin
- Submitted by:
- Ye Bai
- Last updated:
- 14 October 2016 - 4:44am
- Document Type:
- Poster
- Document Year:
- 2016
- Presenters:
- Ye Bai
Traditional hybrid DNN-HMM ASR systems for keyword spotting (KWS), which model HMM states, are not flexible to optimize for a specific language. In this paper, we construct an ASR system for keyword spotting in Mandarin based on an end-to-end acoustic model. The model is built from an LSTM-RNN and trained with the connectionist temporal classification (CTC) objective. The input of the network is a sequence of acoustic features, and the output is the probabilities of the initials and finals of Mandarin syllables. Compared with hybrid ASR systems, the end-to-end system achieves a significant relative improvement of 6.32% in ATWV. The best result of our system is an ATWV of 0.8310 on the RASC863 data set. The proposed CTC-based method can be applied to KWS in a specific language.
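To illustrate how per-frame output probabilities over initials and finals can be turned into a label sequence, here is a minimal sketch of CTC best-path (greedy) decoding: take the most probable label in each frame, merge consecutive repeats, and drop the blank symbol. This is a standard CTC decoding rule, not necessarily the decoder used in the poster. The label inventory `LABELS`, the probabilities in `FRAME_PROBS`, and the function name `ctc_greedy_decode` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence

BLANK = "<blank>"  # CTC blank symbol (name is an assumption for this sketch)


def ctc_greedy_decode(
    frame_probs: Sequence[Sequence[float]],
    labels: Sequence[str],
    blank: str = BLANK,
) -> List[str]:
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    decoded: List[str] = []
    previous: Optional[str] = None
    for probs in frame_probs:
        # Pick the index of the highest-probability label for this frame.
        best_index = max(range(len(probs)), key=probs.__getitem__)
        best = labels[best_index]
        # Emit a label only when it differs from the previous frame's label
        # and is not the blank symbol.
        if best != previous and best != blank:
            decoded.append(best)
        previous = best
    return decoded


# Hypothetical toy inventory: blank plus a few Mandarin initials and finals.
LABELS = [BLANK, "n", "i", "h", "ao"]

# Made-up per-frame probabilities (one row per frame, one column per label).
FRAME_PROBS = [
    [0.10, 0.70, 0.10, 0.05, 0.05],  # "n"
    [0.10, 0.60, 0.20, 0.05, 0.05],  # "n" again, merged with the frame above
    [0.20, 0.10, 0.60, 0.05, 0.05],  # "i"
    [0.80, 0.05, 0.05, 0.05, 0.05],  # blank, dropped
    [0.10, 0.05, 0.05, 0.70, 0.10],  # "h"
    [0.10, 0.05, 0.05, 0.10, 0.70],  # "ao"
    [0.10, 0.05, 0.05, 0.10, 0.70],  # "ao" again, merged
]

print(ctc_greedy_decode(FRAME_PROBS, LABELS))  # prints ['n', 'i', 'h', 'ao']
```

A blank between two identical labels keeps them separate, which is how CTC represents a genuinely repeated unit. A keyword spotting system would then search the decoded sequence, or more typically a lattice built from the frame-level probabilities, for the initial and final sequence of each keyword.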