Mismatched Training Data Enhancement for Automatic Recognition of Children’s Speech using DNN-HMM

Abstract: 

The increasing profusion of commercial automatic speech recognition technology applications has been driven by big-data techniques, making use of high quality labelled speech datasets. Children’s speech displays greater time and frequency domain variability than typical adult speech, has less training material available in both depth and breadth, and presents difficulties relating to capture quality. All of these factors act to reduce the achievable performance of systems that recognise children’s speech. In this paper, children’s speech recognition is investigated using a hybrid acoustic modelling approach based on deep neural networks and Gaussian mixture models with hidden Markov model back ends. We explore the incorporation of mismatched training data to achieve a better acoustic model and improve performance in the face of limited training data, as well as training data augmentation using noise. We also explore two arrangements for vocal tract length normalisation and a gender-based data selection technique suitable for training a children’s speech recogniser.

Paper Details

Authors: Ian McLoughlin, Wu Guo, Lirong Dai
Submitted On: 14 October 2016 - 5:48am
Short Link:
Type: Poster
Event:
Document Year: 2016

Document Files

ISCSLP_poster(MengjieQian) .pdf


[1] Ian McLoughlin, Wu Guo, Lirong Dai, "Mismatched Training Data Enhancement for Automatic Recognition of Children’s Speech using DNN-HMM", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1186. Accessed: Feb. 18, 2020.