Exploiting Language-Mismatched Phoneme Recognizers for Unsupervised Acoustic Modeling
- Submitted by: Siyuan Feng
- Last updated: 13 October 2016 - 8:27am
- Document Type: Presentation Slides
- Document Year: 2016
- Presenters: Siyuan Feng
- Paper Code: 118
This paper describes an investigation into acoustic modeling in the absence of transcribed training data. We propose using language-mismatched phoneme recognizers to assist unsupervised segmentation and segment clustering for a new language. A language-mismatched recognizer divides an input utterance into variable-length segments, and each segment is represented by a feature vector derived from the phoneme posterior probabilities. A spectral clustering algorithm is developed to group the segments into a prescribed number of clusters, which represent a set of basic speech units in the target language. Exploiting multiple recognizers for different languages covers a wider phonetic space, which improves segmentation and clustering performance. Experimental results on a multilingual speech database confirm the effectiveness of the proposed method.
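The segmentation and segment-representation steps can be illustrated with a minimal sketch. The abstract does not say how boundaries are placed or how the feature vector is computed from the posteriors, so two assumptions are made here: a boundary falls wherever the top-scoring phoneme changes, and each segment is represented by the mean of its frame-level posteriors. The function names and the toy posteriors are hypothetical, and the spectral clustering step is not shown.

```python
from typing import List, Tuple

# One row per frame, one column per phoneme class of the mismatched recognizer.
Posteriors = List[List[float]]


def segment_by_top_phoneme(posteriors: Posteriors) -> List[Tuple[int, int]]:
    """Split frames into [start, end) runs sharing the same top-scoring phoneme.

    Assumption: the source does not specify the boundary rule.
    """
    if not posteriors:
        return []
    # Index of the most probable phoneme in each frame.
    labels = [max(range(len(frame)), key=frame.__getitem__) for frame in posteriors]
    segments = []
    start = 0
    for t in range(1, len(labels)):
        if labels[t] != labels[t - 1]:
            segments.append((start, t))
            start = t
    segments.append((start, len(labels)))
    return segments


def segment_features(
    posteriors: Posteriors, segments: List[Tuple[int, int]]
) -> List[List[float]]:
    """Represent each segment by its mean posterior vector.

    Assumption: mean pooling stands in for the unspecified feature derivation.
    """
    features = []
    for start, end in segments:
        n = end - start
        dim = len(posteriors[start])
        features.append(
            [sum(posteriors[t][d] for t in range(start, end)) / n for d in range(dim)]
        )
    return features


# Toy posteriors: 5 frames over 3 hypothetical phoneme classes.
toy = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
]
segments = segment_by_top_phoneme(toy)
features = segment_features(toy, segments)
```

Each resulting feature vector has one dimension per phoneme class, so segments of different durations become directly comparable. With several recognizers, one plausible extension is to concatenate the per-recognizer vectors computed over a shared segmentation, though the abstract does not confirm that this is how the recognizers are combined.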