
FILTERBANK LEARNING USING CONVOLUTIONAL RESTRICTED BOLTZMANN MACHINE FOR SPEECH RECOGNITION

Citation Author(s):
Hardik B. Sailor, Hemant A. Patil
Submitted by:
Hardik Bhupendrabhai
Last updated:
31 March 2016 - 4:04am
Document Type:
Poster
Document Year:
2016
Event:
Presenters:
Hardik Sailor
Paper Code:
2705
 

This paper presents the Convolutional Restricted Boltzmann Machine (ConvRBM) as a model for speech signals. We develop a ConvRBM with sampling from noisy rectified linear units (NReLUs). The ConvRBM is trained in an unsupervised way to model speech signals of arbitrary length, and the weights of the model can represent an auditory-like filterbank. The learned filterbank is also nonlinear with respect to the center frequencies of its subband filters, similar to standard filterbanks (such as Mel, Bark, and ERB). We use the proposed model as a front-end to learn features and apply them to a speech recognition task. With GMM-HMM systems, ConvRBM features give a relative improvement over MFCC of 5% on the TIMIT test set and 7% on the WSJ0 database for both Nov’92 test sets. With DNN-HMM systems, we achieve a relative improvement of 3% on the TIMIT test set over MFCC and Mel filterbank (FBANK) features. On the WSJ0 Nov’92 test sets, we achieve a relative improvement of 4-14% using ConvRBM features over MFCC features and 3.6-5.6% using the ConvRBM filterbank over FBANK features.
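To make the NReLU sampling step concrete, the following is a minimal sketch of how hidden units of a 1-D convolutional layer might be sampled from a raw signal. It is not the authors' implementation. The sampling rule max(0, x + N(0, sigmoid(x))) is the standard noisy rectified linear unit formulation. The function names, the filter length of 8, the three filters, the random initial weights, and the synthetic sine input are all assumptions chosen for illustration. Because each filter slides over the whole input, the same code works for signals of any length.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def nrelu_sample(x, rng):
    """Sample a noisy rectified linear unit: max(0, x + N(0, sigmoid(x)))."""
    # random.gauss takes a standard deviation, so use sqrt of the variance.
    noise = rng.gauss(0.0, math.sqrt(sigmoid(x)))
    return max(0.0, x + noise)


def conv_hidden_pre_activation(signal, weights, bias):
    """Valid-mode 1-D correlation of the signal with one filter, plus a bias."""
    m = len(weights)
    return [
        sum(signal[i + j] * weights[j] for j in range(m)) + bias
        for i in range(len(signal) - m + 1)
    ]


def sample_hidden(signal, filters, biases, rng):
    """Return one list of NReLU samples per filter (one list per subband)."""
    return [
        [nrelu_sample(a, rng) for a in conv_hidden_pre_activation(signal, w, b)]
        for w, b in zip(filters, biases)
    ]


# Hypothetical toy setup: 3 random filters of length 8 on a 64-sample sine wave.
rng = random.Random(0)
signal = [math.sin(2 * math.pi * 0.05 * n) for n in range(64)]
filters = [[rng.gauss(0.0, 0.1) for _ in range(8)] for _ in range(3)]
biases = [0.0, 0.0, 0.0]
hidden = sample_hidden(signal, filters, biases, rng)
```

In this sketch each filter yields 64 - 8 + 1 = 57 hidden samples, all non-negative because of the rectification. A trained model would replace the random weights with learned ones, which the abstract reports as resembling an auditory-like filterbank.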
