- Speech Dereverberation Based on Integrated Deep and Ensemble Learning Algorithm
- Improved Noise Characterization for Relative Impulse Response Estimation
Relative Impulse Responses (ReIRs) have several applications in speech enhancement, noise suppression, and source localization for multi-channel speech processing in reverberant environments. Noise is usually assumed to be white Gaussian when estimating the ReIR between two microphones. We show that the noise in this system identification problem instead depends on the microphone measurements and on the ReIR itself.
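For context, the conventional baseline that the abstract questions can be sketched as an ordinary least-squares system identification: if the noise were white Gaussian, the maximum-likelihood ReIR estimate would be the least-squares FIR filter mapping one microphone signal to the other. The sketch below is an illustration only, not the authors' method. The function name `estimate_reir_ls`, the causal FIR model, and the synthetic signals are all assumptions made for the example; real ReIRs are generally non-causal and much longer.

```python
import numpy as np

def estimate_reir_ls(x1, x2, length):
    """Least-squares FIR estimate h such that x2 ~= h * x1.

    Assumes (for illustration) a causal filter with `length` taps and
    additive white noise on x2, which is the conventional assumption.
    """
    n = len(x1)
    # Convolution matrix: column k holds x1 delayed by k samples.
    X = np.zeros((n, length))
    for k in range(length):
        X[k:, k] = x1[:n - k]
    h, *_ = np.linalg.lstsq(X, x2, rcond=None)
    return h

# Synthetic two-microphone example (hypothetical data).
rng = np.random.default_rng(0)
true_h = np.array([1.0, 0.5, -0.25, 0.1])
x1 = rng.standard_normal(2000)
x2 = np.convolve(x1, true_h)[:len(x1)] + 0.01 * rng.standard_normal(len(x1))

h_hat = estimate_reir_ls(x1, x2, length=len(true_h))
```

With noise that is truly white and independent of the signals, `h_hat` lands close to `true_h`. The abstract's point is that this independence does not hold in practice, so a plain least-squares fit of this kind is not the ideal estimator.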
- INFRASONIC SCENE FINGERPRINTING FOR AUTHENTICATING SPEAKER LOCATION
Ambient infrasound, at frequencies well below 20 Hz, is known to carry robust navigation cues that can be exploited to authenticate the location of a speaker. Unfortunately, many mobile devices such as smartphones are optimized for the human auditory range and therefore suppress information in the infrasonic region. In this paper, we show that these ultra-low-frequency cues can still be extracted from a standard smartphone recording by using acceleration-based cepstral features.
- CONFIDENCE MEASURES FOR CTC-BASED PHONE SYNCHRONOUS DECODING
- RECURRENT CONVOLUTIONAL NEURAL NETWORK FOR SPEECH PROCESSING
Different neural networks have exhibited excellent performance on various speech processing tasks, and each has specific advantages and disadvantages. We propose to use a recently developed deep learning model, the recurrent convolutional neural network (RCNN), for speech processing; it inherits some merits of the recurrent neural network (RNN) and the convolutional neural network (CNN). The core module can be viewed as a convolutional layer with an RNN embedded in it, which enables the model to efficiently capture both temporal and frequency dependencies in the speech spectrogram.
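To make the idea of "a convolutional layer with an RNN embedded in it" concrete, here is a minimal NumPy sketch of one common formulation of a recurrent convolutional layer: a feed-forward convolution of the input is computed once, and the layer state is then refined over a few steps by adding a recurrent convolution of the previous state. This is an assumption for illustration and may differ from the authors' model. The names `conv2d_same` and `recurrent_conv_layer`, the single-channel input, the ReLU nonlinearity, and the random kernels are all hypothetical choices.

```python
import numpy as np

def conv2d_same(x, k):
    """Zero-padded 2-D cross-correlation; output has the same shape as x.

    Assumes an odd-sized kernel k.
    """
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * xp[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def recurrent_conv_layer(u, w_ff, w_rec, steps):
    """Unfold a recurrent convolutional layer for `steps` iterations.

    state_0 = relu(w_ff * u)
    state_t = relu(w_ff * u + w_rec * state_{t-1})
    """
    ff = conv2d_same(u, w_ff)       # feed-forward term, computed once
    state = np.maximum(ff, 0.0)
    for _ in range(steps):
        state = np.maximum(ff + conv2d_same(state, w_rec), 0.0)
    return state

# Hypothetical spectrogram patch: 8 frequency bins x 10 time frames.
rng = np.random.default_rng(0)
spec = rng.standard_normal((8, 10))
w_ff = 0.1 * rng.standard_normal((3, 3))
w_rec = 0.1 * rng.standard_normal((3, 3))

out = recurrent_conv_layer(spec, w_ff, w_rec, steps=3)
```

Each recurrent step widens the effective receptive field along both the time and frequency axes without adding new parameters, which is one way to read the efficiency claim in the abstract.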
- Unsupervised Speaker Adaptation of BLSTM-RNN for LVCSR Based on Speaker Code
Recently, speaker-code-based adaptation has been successfully extended to recurrent neural networks using bidirectional Long Short-Term Memory (BLSTM-RNN) [1]. Experiments on the small-scale TIMIT task have demonstrated that speaker-code-based adaptation is also valid for BLSTM-RNN. In this paper, we evaluate this method on a large-scale task and introduce an error normalization method to balance the back-propagation errors derived from different layers for speaker codes. We also use singular value decomposition (SVD) for model compression.
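As a rough illustration of SVD-based model compression in general, a weight matrix can be replaced by the product of two thinner matrices obtained from a truncated SVD. The abstract does not say which layers are compressed or what rank is kept, so the helper name `svd_compress`, the matrix sizes, and the rank in this sketch are assumptions chosen only for the example.

```python
import numpy as np

def svd_compress(W, rank):
    """Factor W (m x n) into A (m x rank) and B (rank x n) via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # absorb singular values into the left factor
    B = Vt[:rank, :]
    return A, B

# Hypothetical 512 x 512 weight matrix that is exactly rank 16.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 16)) @ rng.standard_normal((16, 512))

A, B = svd_compress(W, rank=16)
params_before = W.size            # 512 * 512
params_after = A.size + B.size    # 2 * 512 * 16
```

Here the factorization is lossless because `W` was constructed with rank 16. For trained weights the discarded singular values introduce an approximation error, which is typically traded off against the reduction in parameters.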
- ESTIMATION OF TDOA FOR ROOM REFLECTIONS BY ITERATIVE WEIGHTED L1 CONSTRAINT