
OUT-OF-VOCABULARY WORD RECOVERY USING FST-BASED SUBWORD UNIT CLUSTERING IN A HYBRID ASR SYSTEM - poster for ICASSP 2018

Citation Author(s):
Ekaterina Egorova
Submitted by:
Ekaterina Egorova
Last updated:
24 April 2018 - 10:23am
Document Type:
Poster
Document Year:
2018
Event:
Presenters:
Ekaterina Egorova
Paper Code:
4076
 

The paper presents a new approach to extracting useful information from out-of-vocabulary (OOV) speech regions in ASR system output. The system uses a hybrid decoding network that contains both words and sub-word units. In the decoded lattices, candidates for OOV regions are identified as sub-graphs of sub-word units. To facilitate OOV word recovery, we search for recurring OOVs by clustering the detected candidates. The clustering metric is based on a comparison of the sub-graphs corresponding to the OOV candidates. The proposed method discovers repeating OOV words and finds their graphemic representation more robustly than conventional techniques that take into account only the one-best sub-word string hypothesis.
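The idea of comparing whole sets of hypotheses rather than one-best strings can be illustrated with a minimal sketch. This is not the poster's implementation: each OOV candidate is flattened here into a hypothetical dictionary that maps sub-word strings to posterior probabilities (a stand-in for a lattice sub-graph), the similarity is assumed to be the probability mass two candidates assign to identical strings, and the clustering is a simple single-link merge above an arbitrary threshold. All names and the toy data are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy candidates: sub-word string -> posterior probability.
# A real system would hold lattice sub-graphs (FSTs) instead of flat dicts.
c0 = {"k ao v ih d": 0.60, "k ow v ih d": 0.40}
c1 = {"k ow v ih d": 0.55, "k ao v ih d": 0.35, "g ow v ih d": 0.10}
c2 = {"b r eh k s ih t": 0.90, "b r eh g z ih t": 0.10}


def one_best(cand):
    """Return the single most probable sub-word string of a candidate."""
    return max(cand, key=cand.get)


def overlap_score(cand_a, cand_b):
    """Probability mass the two candidates assign to the same string."""
    return sum(p * cand_b[hyp] for hyp, p in cand_a.items() if hyp in cand_b)


def cluster_candidates(candidates, threshold=0.2):
    """Single-link clustering: merge every pair scoring at least threshold."""
    parent = list(range(len(candidates)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(candidates)), 2):
        if overlap_score(candidates[i], candidates[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(candidates)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    print(one_best(c0), "|", one_best(c1))
    print(cluster_candidates([c0, c1, c2]))
```

In this toy data the first two candidates have different one-best strings, so exact one-best matching would keep them apart, while the overlap of their alternative hypotheses places them in the same cluster. The threshold value of 0.2 is arbitrary and would need tuning on real data.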
