
AN INVESTIGATION INTO INSTANTANEOUS FREQUENCY ESTIMATION METHODS FOR IMPROVED SPEECH RECOGNITION FEATURES

Citation Author(s):
Saurabhchand Bhati
Submitted by:
Shekhar Nayak
Last updated:
11 November 2017 - 8:10am
Document Type:
Poster
Document Year:
2017
Event:
Presenters:
Shekhar Nayak
Paper Code:
GS-SIPA-P.1.5
 

Several recent studies point to the importance of the analytic phase
of the speech signal in human perception, especially in noisy
conditions. However, phase information is still not used in
state-of-the-art speech recognition systems. In this paper, we
illustrate the importance of the analytic phase of the speech signal
for automatic speech recognition. Because computing the analytic
phase suffers from the inevitable phase-wrapping problem, we extract
features from its time derivative, referred to as the instantaneous
frequency (IF). We highlight the issues involved in IF extraction
from speech-like signals and propose suitable modifications for IF
extraction from speech signals. We used the deep neural network (DNN)
framework to build a speech recognition system from features
extracted from the IF of speech signals. The speech recognition
system based on IF features delivered a phoneme error rate of 21.8%
on the TIMIT database, while the baseline system based on
mel-frequency cepstral coefficients (MFCCs) delivered a phoneme error
rate of 18.4%. Combining the IF-based and MFCC-based systems using
minimum Bayes risk (MBR) decoding provided a relative improvement of
8.7% over the baseline system, illustrating the significance of the
analytic phase for speech recognition.
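
To make the link between analytic phase and IF concrete, here is a minimal textbook sketch, not the method proposed in the paper. It builds the analytic signal with an FFT-based Hilbert transform and estimates the IF from the phase difference between consecutive samples, which avoids explicit phase unwrapping. The function names, the 16 kHz sampling rate, and the 1 kHz test tone are assumptions made for illustration. The abstract does not describe the proposed modifications for speech signals, so none are attempted here.

```python
import numpy as np


def analytic_signal(x):
    """Return the analytic signal of a real 1-D sequence via the FFT."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    # Keep DC (and the Nyquist bin when n is even), double the positive
    # frequencies, and zero the negative frequencies.
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)


def instantaneous_frequency(x, fs):
    """Estimate the IF in Hz from phase increments of the analytic signal."""
    z = analytic_signal(x)
    # angle(z[n+1] * conj(z[n])) is the phase increment already wrapped
    # to (-pi, pi], so the phase itself never needs to be unwrapped.
    dphi = np.angle(z[1:] * np.conj(z[:-1]))
    return dphi * fs / (2.0 * np.pi)


if __name__ == "__main__":
    fs = 16000.0                      # assumed sampling rate (Hz)
    t = np.arange(1600) / fs          # 0.1 s of samples
    tone = np.cos(2.0 * np.pi * 1000.0 * t)  # hypothetical 1 kHz test tone
    f_inst = instantaneous_frequency(tone, fs)
    print("median IF estimate (Hz):", float(np.median(f_inst)))
```

This estimator is only meaningful for a signal dominated by a single frequency component at each instant. Speech contains many components, so in practice the signal would likely need to be split into narrow bands first; that step is an assumption here and is not taken from the abstract.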
