Statistical Normalisation of Phase-based Feature Representation for Robust Speech Recognition

Citation Author(s):
Erfan Loweimi, Jon Barker, Thomas Hain
Submitted by:
Erfan Loweimi
Last updated:
27 February 2017 - 3:08pm
Document Type:
Poster
Document Year:
2017
Event:
Paper Code:
1809
 

In earlier work we proposed a source-filter decomposition of
speech through phase-based processing. The decomposition leads
to novel speech features that are extracted from the filter component
of the phase spectrum. This paper analyses this spectrum and the
proposed representation by evaluating statistical properties at various
points along the parametrisation pipeline. We show that the speech
phase spectrum has a bell-shaped distribution, in contrast to
the uniform assumption that is usually made. It is demonstrated that
the uniform density (which implies that the corresponding sequence
is least informative) is an artefact of phase wrapping and not an
original characteristic of this spectrum. In addition, we extend the
idea of statistical normalisation, usually applied to magnitude-based
features, into the phase domain. Based on the statistical structure
of the phase-based features, which is shown to be super-Gaussian
in the clean condition, three normalisation schemes, namely
Gaussianisation, Laplacianisation and table-based histogram
equalisation, are applied to improve robustness. Speech recognition
experiments using Aurora-2 show that applying an optimal normalisation
scheme at the right stage of the feature extraction process can
produce average relative WER reductions of up to 18.6% across the
0-20 dB SNR conditions.
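
As a rough illustration of two ideas in the abstract, the Python sketch below (standard library only) first draws a hypothetical bell-shaped unwrapped phase and shows that wrapping it to the principal interval gives a nearly flat histogram, then applies a rank-based Gaussianisation to hypothetical super-Gaussian (Laplacian) samples. The chosen distributions, the 10 rad standard deviation, the sample sizes and all function names are assumptions made for illustration; this is not the authors' feature pipeline or their normalisation implementation.

```python
import math
import random
from statistics import NormalDist


def wrap_phase(phi):
    """Wrap an angle in radians to the principal interval [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi


def histogram(values, n_bins, lo, hi):
    """Relative bin frequencies over [lo, hi); outliers are clamped to the edge bins."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = max(0, min(int((v - lo) / width), n_bins - 1))
        counts[idx] += 1
    return [c / len(values) for c in counts]


def excess_kurtosis(values):
    """Sample excess kurtosis: about 0 for Gaussian data, positive for super-Gaussian data."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    fourth = sum((v - mean) ** 4 for v in values) / n
    return fourth / (var ** 2) - 3.0


def gaussianise(values):
    """Rank-based Gaussianisation (an assumed, generic variant).

    Each value is mapped through the empirical CDF and then through the
    inverse CDF of the standard normal distribution.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    out = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n keeps the CDF value strictly inside (0, 1).
        out[i] = std_normal.inv_cdf((rank + 0.5) / n)
    return out


rng = random.Random(0)

# Assumption: unwrapped phase modelled as a zero-mean Gaussian, std 10 rad.
unwrapped = [rng.gauss(0.0, 10.0) for _ in range(50000)]
wrapped = [wrap_phase(p) for p in unwrapped]

unwrapped_hist = histogram(unwrapped, 8, -40.0, 40.0)   # bell-shaped
wrapped_hist = histogram(wrapped, 8, -math.pi, math.pi)  # close to flat

# Assumption: a super-Gaussian feature modelled as Laplacian samples
# (the difference of two unit-rate exponentials is Laplace-distributed).
feature = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(5000)]
normalised = gaussianise(feature)

print("unwrapped histogram:", [round(h, 3) for h in unwrapped_hist])
print("wrapped histogram:  ", [round(h, 3) for h in wrapped_hist])
print("excess kurtosis before:", round(excess_kurtosis(feature), 2))
print("excess kurtosis after: ", round(excess_kurtosis(normalised), 2))
```

Under these assumptions the unwrapped histogram should peak in the centre bins while the wrapped one comes out roughly level, and the excess kurtosis should drop from a clearly positive value towards zero after Gaussianisation. A Laplacianisation or table-based histogram equalisation step would follow the same pattern with a different target distribution.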
