AN ATTENTION MODEL FOR HYPERNASALITY PREDICTION IN CHILDREN WITH CLEFT PALATE

Citation Author(s):
Vikram C Mathad, Nancy Scherer, Kathy Chapman, Julie Liss, and Visar Berisha
Submitted by:
Vikram C Mathad
Last updated:
28 June 2021 - 1:34am
Document Type:
Poster
Document Year:
2021
Event:
Presenters:
Vikram C Mathad
Paper Code:
SPE-56.2
 

Hypernasality refers to the perception of abnormal nasal resonances in vowels and voiced consonants. Estimating hypernasality severity from connected speech samples involves learning a mapping between frame-level features and utterance-level clinical ratings of hypernasality. However, not all speech frames contribute equally to the perception of hypernasality. In this work, we propose an attention-based bidirectional long short-term memory (BLSTM) model that directly maps the frame-level features to utterance-level ratings by focusing only on the specific speech frames that carry hypernasal cues. The model’s performance is evaluated on the Americleft database, which contains speech samples of children with cleft palate and clinical ratings of hypernasality. We analyzed the attention weights over broad phonetic categories and found that the model yields results consistent with what is known in the speech science literature. Further, the correlation between the predicted and perceptual ratings is found to be significant ($r=0.684$, $p < 0.001$) and better than that of conventional BLSTMs trained using frame-wise and last-frame approaches.
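
To illustrate the general idea of attention over frames, the following is a minimal sketch of softmax attention pooling, not the authors' implementation. It assumes a simple dot-product scoring function; the names `attention_pool`, `frames`, and `w` and all numeric values are hypothetical, and in a real model the frame vectors would be BLSTM outputs and `w` would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, w):
    """Pool T frame-level vectors into one utterance-level vector.

    frames: list of T feature vectors (stand-ins for BLSTM outputs).
    w: hypothetical scoring vector (assumed dot-product scoring).
    Returns the pooled vector and the per-frame attention weights.
    """
    # One scalar relevance score per frame.
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in frames]
    # Normalize scores so the weights are positive and sum to 1.
    alphas = softmax(scores)
    # Weighted sum of frames, computed dimension by dimension.
    dim = len(frames[0])
    pooled = [sum(a * h[d] for a, h in zip(alphas, frames)) for d in range(dim)]
    return pooled, alphas


# Toy input: three frames, the second of which scores highest.
frames = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
w = [1.0, 1.0]
pooled, alphas = attention_pool(frames, w)
```

In a sketch like this, the pooled vector would feed a regression layer that outputs the utterance-level rating, and the weights in `alphas` could be averaged per phonetic category to inspect which frames the model relies on.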
