LINEAR PREDICTION-BASED PART-DEFINED AUTO-ENCODER USED FOR SPEECH ENHANCEMENT

This paper proposes a linear prediction-based part-defined auto-encoder (PAE) network for speech enhancement. A PAE is an auto-encoder in which either the decoder or the encoder is predefined, based on an efficient learning algorithm or a classical model. Here, the PAE uses an AR-Wiener filter as the decoder, and the AR-Wiener filter is reformulated as a linear prediction (LP) model by incorporating a modification factor derived from the residual signal. The line spectral frequency (LSF) parameters of speech and noise, together with the Wiener filtering mask, serve as training targets. Finally, the proposed LP-based PAE is compared with a baseline method, a DNN trained on the Wiener filtering mask. The LP-based PAE achieves better PESQ and STOI scores than the baseline at lower signal-to-noise ratio (SNR) levels.
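
As a rough illustration of the decoder idea, the sketch below computes a textbook AR-Wiener mask from the linear prediction coefficients of speech and noise. It assumes each source is modelled as an all-pole (AR) process with power spectrum g^2 / |A(e^{jw})|^2, and that the mask is P_s / (P_s + P_n). The function names, the FFT size, and the example coefficients are hypothetical. The residual-based modification factor described in the abstract is not reproduced here, and neither is the LSF-to-LP conversion.

```python
import numpy as np

def ar_power_spectrum(lpc, gain, n_fft=512):
    """Power spectrum of an AR model: gain^2 / |A(e^{jw})|^2.

    lpc holds [a1, ..., ap] for A(z) = 1 + a1 z^-1 + ... + ap z^-p.
    """
    a = np.concatenate(([1.0], np.asarray(lpc, dtype=float)))
    # Zero-padded FFT evaluates A(z) on the unit circle (0 .. Nyquist).
    A = np.fft.rfft(a, n_fft)
    # Small floor avoids division by zero if A has a root on the circle.
    return gain ** 2 / np.maximum(np.abs(A) ** 2, 1e-12)

def ar_wiener_mask(speech_lpc, speech_gain, noise_lpc, noise_gain, n_fft=512):
    """Classical Wiener gain per frequency bin from two AR models."""
    p_s = ar_power_spectrum(speech_lpc, speech_gain, n_fft)
    p_n = ar_power_spectrum(noise_lpc, noise_gain, n_fft)
    return p_s / (p_s + p_n)

# Hypothetical example: first-order low-pass "speech" against white noise.
speech_lpc = [-0.9]   # A_s(z) = 1 - 0.9 z^-1, energy concentrated near DC
noise_lpc = [0.0]     # A_n(z) = 1, flat spectrum
mask = ar_wiener_mask(speech_lpc, 1.0, noise_lpc, 1.0)
```

In an enhancement pipeline, a mask of this kind would be multiplied with the noisy magnitude spectrum of each frame. In this toy case the mask is close to 1 where the speech model dominates (low frequencies) and falls toward 0 where the noise dominates.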

Documents

Poster

Poster_ICASSP_2019_LPPAE_zihao.pdf