Linearly Augmented Deep Neural Network
- Submitted by: Pegah Ghahremani
- Last updated: 30 April 2016 - 7:54pm
- Document Type: Presentation Slides
- Document Year: 2016
- Presenters: Jasha Droppo
- Paper Code: SP-L5.6
Deep neural networks (DNNs) are a powerful tool for many large vocabulary continuous speech recognition (LVCSR) tasks. Training a very deep network is challenging, and pre-training techniques are needed to achieve the best results. In this paper, we propose a new type of network architecture, the Linearly Augmented Deep Neural Network (LA-DNN). This type of network augments each non-linear layer with a linear connection from layer input to layer output. The resulting LA-DNN model eliminates the need for pre-training, addresses the vanishing gradient problem for deep networks, has higher capacity for modeling linear transformations, trains significantly faster than a conventional DNN, and produces better acoustic models. The proposed model has been evaluated on TIMIT phoneme recognition and AMI speech recognition tasks. Experimental results show that LA-DNN models can have 70% fewer parameters than a DNN while still improving accuracy. The smaller LA-DNN model improves TIMIT phone accuracy by 2% absolute and AMI word accuracy by 1.7% absolute.
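The layer structure described in the abstract, a non-linear layer augmented with a linear connection from its input to its output, can be sketched as follows. This is a minimal illustration rather than the authors' implementation. The abstract does not specify the nonlinearity or the exact form of the linear connection, so the sigmoid activation, the full learned matrix `T`, and the names `la_layer` and `la_dnn_forward` are all assumptions made for this example.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function (assumed nonlinearity)."""
    return 1.0 / (1.0 + np.exp(-z))


def la_layer(x, W, b, T):
    """One linearly augmented layer (hypothetical formulation).

    x: (batch, d_in) layer input
    W: (d_in, d_out) weights of the non-linear path
    b: (d_out,) bias of the non-linear path
    T: (d_in, d_out) weights of the linear path (assumed to be a full matrix)
    """
    nonlinear_path = sigmoid(x @ W + b)
    linear_path = x @ T  # linear connection from layer input to layer output
    return nonlinear_path + linear_path


def la_dnn_forward(x, layers):
    """Apply a stack of linearly augmented layers in sequence."""
    for W, b, T in layers:
        x = la_layer(x, W, b, T)
    return x


# Usage: a small stack with randomly initialized parameters.
rng = np.random.default_rng(0)
dims = [8, 16, 16, 4]
layers = [
    (rng.standard_normal((d_in, d_out)) * 0.1,
     np.zeros(d_out),
     rng.standard_normal((d_in, d_out)) * 0.1)
    for d_in, d_out in zip(dims[:-1], dims[1:])
]
x = rng.standard_normal((5, 8))
y = la_dnn_forward(x, layers)
print(y.shape)  # (5, 4)
```

Because the linear path bypasses the nonlinearity, the gradient with respect to the layer input includes a term that does not pass through the sigmoid derivative, which is one way to read the abstract's claim about the vanishing gradient problem.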