Sorry, you need to enable JavaScript to visit this website.

On the use of grapheme models for searching in large spoken archives

Citation Author(s):
Jan Švec, Josef V. Psutka, Jan Trmal, Luboš Šmídl, Pavel Ircing, Jan Sedmidubsky
Submitted by:
Jan Svec
Last updated:
13 April 2018 - 2:55am
Document Type:
Poster
Document Year:
2018
Event:
Presenters:
Jan Švec
Paper Code:
HLT-P4.7
 

This paper explores the possibility to use grapheme-based word and sub-word models in the task of spoken term detection (STD). The usage of grapheme models eliminates the need for expert-prepared pronunciation lexicons (which are often far from complete) and/or trainable grapheme-to-phoneme (G2P) algorithms that are frequently rather inaccurate, especially for rare words (words coming from a~different language). Moreover, the G2P conversion of the search terms that need to be performed on-line can substantially increase the response time of the STD system. Our results show that using various grapheme-based models, we can achieve STD performance (measured in terms of ATWV) comparable with phoneme-based models but without the additional burden of G2P conversion.

up
0 users have voted: