DNN-based Multi-Channel Speech Coding Employing Sound Localization

Citation Author(s):
Shuhao Deng, Changchun Bao
Submitted by:
Changchun Bao
Last updated:
20 February 2022 - 9:11pm
Document Type:
Presentation Slides
Document Year:
2022
Event:
Presenters:
Shuhao Deng
Paper Code:
106
 

This paper proposes a novel multi-channel speech coding method based on deep neural networks (DNNs) that exploits sound localization in the form of the time difference of arrival (TDOA). At the encoder, only the speech signals of two reference channels and the estimated TDOAs are coded. The decoder embeds a trained DNN that models the relationship between the amplitude spectra of the two reference channels and those of the other channels, and uses it to recover the amplitude spectra of the other channels from the two decoded reference signals. Specifically, the speech signals of the two reference channels are encoded by an existing codec. The proposed method has two advantages. First, it effectively reduces the bit rate with acceptable speech distortion. Second, it maintains backward compatibility with the existing codec. Experimental results show that the proposed method achieves higher coding performance at lower bit rates than some existing multi-channel speech coding approaches.
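
To make the TDOA side information concrete, here is a minimal sketch of one common way to estimate a time difference of arrival between two channels: pick the lag that maximizes their cross-correlation. The abstract does not state which estimator the authors use, so this is an illustrative assumption, and the function name `estimate_tdoa`, its parameters, and the synthetic test signal are all hypothetical.

```python
import math


def estimate_tdoa(ref, other, max_lag, sample_rate):
    """Estimate the delay (in seconds) of `other` relative to `ref`.

    Hypothetical helper: searches integer lags in [-max_lag, max_lag]
    and returns the one that maximizes the plain cross-correlation.
    This is an assumed estimator, not necessarily the paper's.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(ref)):
            j = i + lag
            if 0 <= j < len(other):
                score += ref[i] * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


# Synthetic example: a decaying sinusoid and a copy delayed by 5 samples.
sample_rate = 16000
delay = 5
ref = [math.sin(0.3 * i) * math.exp(-0.01 * i) for i in range(400)]
other = [0.0] * delay + ref[:-delay]

tdoa = estimate_tdoa(ref, other, max_lag=20, sample_rate=sample_rate)
```

In a scheme like the one described, a value such as `tdoa` would be computed per non-reference channel and transmitted alongside the two coded reference channels; how the TDOAs are quantized and used at the decoder is not specified in the abstract.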