On the use of grapheme models for searching in large spoken archives

Abstract: 

This paper explores the possibility of using grapheme-based word and sub-word models for spoken term detection (STD). Grapheme models eliminate the need for expert-prepared pronunciation lexicons (which are often far from complete) and/or trainable grapheme-to-phoneme (G2P) algorithms, which are frequently rather inaccurate, especially for rare words (words coming from a different language). Moreover, the G2P conversion of the search terms, which needs to be performed on-line, can substantially increase the response time of the STD system. Our results show that with various grapheme-based models we can achieve STD performance (measured in terms of ATWV) comparable to that of phoneme-based models, but without the additional burden of G2P conversion.
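
To illustrate why a grapheme lexicon needs neither an expert-prepared dictionary nor a G2P model, here is a minimal sketch in Python. It assumes, purely for illustration, that each letter of the lowercased, NFC-normalized word becomes one modeling unit; the abstract does not specify the unit inventory actually used, and the function names `grapheme_pronunciation` and `build_grapheme_lexicon` are hypothetical.

```python
import unicodedata


def grapheme_pronunciation(word):
    """Return the word as a list of grapheme units, one per letter.

    Assumption for illustration only: one letter equals one unit.
    """
    # Lowercase and compose diacritics so that "S" + combining caron
    # and the precomposed letter both map to the same single unit.
    normalized = unicodedata.normalize("NFC", word.lower())
    # Keep letters only; digits and punctuation would need verbalization,
    # which is out of scope for this sketch.
    return [ch for ch in normalized if ch.isalpha()]


def build_grapheme_lexicon(words):
    """Build a word -> grapheme-sequence lexicon with no G2P step."""
    return {word: grapheme_pronunciation(word) for word in words}


lexicon = build_grapheme_lexicon(["archive", "Švec"])
for word, units in lexicon.items():
    print(word, " ".join(units))
```

Because the mapping is a direct function of the spelling, a search term typed at query time can be converted the same way, with no model inference involved.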


Paper Details

Authors:
Jan Švec, Josef V. Psutka, Jan Trmal, Luboš Šmídl, Pavel Ircing, Jan Sedmidubsky
Submitted On:
13 April 2018 - 2:55am
Short Link:
Type:
Poster
Event:
Presenter's Name:
Jan Švec
Paper Code:
HLT-P4.7
Document Year:
2018

Document Files

poster.pdf


[1] Jan Švec, Josef V. Psutka, Jan Trmal, Luboš Šmídl, Pavel Ircing, Jan Sedmidubsky, "On the use of grapheme models for searching in large spoken archives", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/2622. Accessed: Jun. 22, 2018.