The purpose of this study is to detect mismatches between a text script and its voice-over. To this end, we present a novel utterance verification (UV) method, which calculates the degree of correspondence between a voice-over and the phoneme sequence of a script. We found that the phoneme recognition probabilities of exaggerated voice-overs are lower than those of ordinary utterances, but their rankings do not change significantly.
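The contrast between probabilities and rankings can be illustrated with a minimal sketch. It is not the authors' method: the function name `expected_phoneme_scores`, the per-frame alignment of script phonemes, and all posterior values below are made-up assumptions chosen only to show how a probability can drop while the rank of the expected phoneme stays the same.

```python
def expected_phoneme_scores(posteriors, expected):
    """Return (mean probability, mean rank) of the expected phoneme.

    posteriors: list of dicts mapping phoneme -> recognition probability,
                one dict per frame (hypothetical recognizer output).
    expected:   list of script phonemes, assumed already aligned to frames.
    """
    probs, ranks = [], []
    for frame, phoneme in zip(posteriors, expected):
        p = frame[phoneme]
        # Rank 1 means no other phoneme is more probable in this frame.
        rank = 1 + sum(1 for q in frame.values() if q > p)
        probs.append(p)
        ranks.append(rank)
    n = len(probs)
    return sum(probs) / n, sum(ranks) / n


# Made-up posteriors: the exaggerated voice-over has flatter distributions.
expected = ["a", "k"]
ordinary = [{"a": 0.8, "i": 0.15, "u": 0.05}, {"k": 0.7, "t": 0.2, "p": 0.1}]
exaggerated = [{"a": 0.5, "i": 0.3, "u": 0.2}, {"k": 0.45, "t": 0.35, "p": 0.2}]

ordinary_prob, ordinary_rank = expected_phoneme_scores(ordinary, expected)
exaggerated_prob, exaggerated_rank = expected_phoneme_scores(exaggerated, expected)
print(ordinary_prob, ordinary_rank)
print(exaggerated_prob, exaggerated_rank)
```

In this toy data the mean probability falls for the exaggerated case while the mean rank is unchanged, which is the kind of behavior that would make a rank-based correspondence score less sensitive to speaking style than a probability-based one.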

We introduce a novel technique for the automatic detection of word boundaries within continuous sentence expressions in Japanese Sign Language from three-dimensional body joint positions. First, the flow of signed sentence data within a temporal neighborhood is determined utilizing the spatial correlations between line segments of inter-joint pairs. Next, a frame-wise binary random forest classifier is trained to distinguish word and non-word frame content based on the extracted spatio-temporal features.
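As a rough illustration of how frame-wise binary decisions relate to word boundaries, the sketch below groups consecutive word frames into segments. It assumes a classifier has already labeled each frame as word (1) or non-word (0); the function name `word_segments` and the label sequence are hypothetical, and the feature extraction and random forest training described above are not reproduced here.

```python
def word_segments(frame_labels):
    """Group consecutive word frames (label 1) into (start, end) pairs.

    Indices are frame numbers; `end` is exclusive.
    """
    segments = []
    start = None
    for i, label in enumerate(frame_labels):
        if label == 1 and start is None:
            start = i                      # a word segment begins
        elif label != 1 and start is not None:
            segments.append((start, i))    # the segment ended at frame i
            start = None
    if start is not None:
        # The sequence ended while still inside a word segment.
        segments.append((start, len(frame_labels)))
    return segments


# Made-up frame labels, as a frame-wise classifier might output them.
labels = [0, 1, 1, 1, 0, 0, 1, 1, 0]
segments = word_segments(labels)
print(segments)  # [(1, 4), (6, 8)]
```

Each segment's start and end indices then serve as candidate word boundaries. In practice some smoothing of isolated misclassified frames would likely be needed before this step.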

Recurrent neural networks have become increasingly popular for the task of language modeling, achieving impressive gains in state-of-the-art speech recognition and natural language processing (NLP) tasks. Recurrent models exploit word dependencies over a much longer context window (as retained by the history states) than is feasible with n-gram language models.
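The difference in context window can be made concrete with a toy sketch. It is not a real recurrent network: `recurrent_state` is a hypothetical stand-in that keeps an exponentially decayed count of every word seen, used only to show that a history state carries information from words an n-gram context has already discarded.

```python
def ngram_context(words, n):
    """Context visible to an n-gram model: only the last n-1 words."""
    return tuple(words[-(n - 1):]) if n > 1 else ()


def recurrent_state(words, decay=0.5):
    """Toy history state: decayed counts of all words seen so far.

    This is an illustrative assumption, not an actual RNN update rule.
    """
    state = {}
    for word in words:
        # Older information fades at each step but is never dropped outright.
        state = {w: v * decay for w, v in state.items()}
        state[word] = state.get(word, 0.0) + 1.0
    return state


history_a = ["stocks", "fell", "sharply", "on", "monday"]
history_b = ["rain", "fell", "sharply", "on", "monday"]

# A trigram model sees identical contexts for both histories.
print(ngram_context(history_a, 3) == ngram_context(history_b, 3))

# The toy history state still distinguishes them via the first word.
print(recurrent_state(history_a) == recurrent_state(history_b))
```

The two histories differ only in their first word, which lies outside the trigram window, so the n-gram contexts match while the history states do not.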
