Coherent processing of signals captured by a wireless acoustic sensor network (WASN) requires estimating parameters such as the sampling-rate offset (SRO) and the sampling-time offset (STO). The asynchronous signals acquired by such a WASN exhibit an accumulating time drift (ATD) that grows linearly with time and depends on the SRO and STO values. In our demonstration, we present a real WASN based on Raspberry Pi computers, where the SRO and ATD values are estimated using the recently proposed double-cross-correlation processor with phase transform (DXCP-PhaT).
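
The linear growth of the ATD can be illustrated with a minimal sketch. It assumes the simple model ATD(t) = STO + SRO * t, with the SRO expressed in parts per million. The function names, the sign convention, and the example values are illustrative assumptions and are not taken from the DXCP-PhaT work, and the sketch does not perform any estimation.

```python
def accumulated_time_drift(t_seconds, sro_ppm, sto_seconds=0.0):
    """Drift in seconds after t_seconds, under the assumed linear model.

    sro_ppm     : sampling-rate offset in parts per million (assumed unit)
    sto_seconds : initial sampling-time offset in seconds
    """
    return sto_seconds + sro_ppm * 1e-6 * t_seconds


def drift_in_samples(t_seconds, sro_ppm, sto_seconds, fs_hz):
    """Same drift, expressed in samples at a nominal sampling rate fs_hz."""
    return accumulated_time_drift(t_seconds, sro_ppm, sto_seconds) * fs_hz


if __name__ == "__main__":
    # Hypothetical values: 50 ppm SRO, no STO, 10 minutes of audio at 16 kHz.
    print(drift_in_samples(600.0, 50.0, 0.0, 16000.0))
```

Under these assumed values the drift reaches roughly 480 samples after ten minutes, which suggests why even a small SRO matters for coherent processing.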

While end-to-end models have shown great success on the automatic speech recognition task, performance degrades severely when target sentences are long-form. Two previously proposed methods, overlapping inference and partial overlapping inference, have been shown to be effective for long-form decoding. For both methods, word error rate (WER) decreases monotonically as the overlapping percentage increases. Setting aside computational cost, the setup with 50% overlapping during inference achieves the best performance. However, a lower overlapping percentage has the advantage of faster inference.
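
The cost side of this trade-off can be sketched by counting the segments a long recording is split into for a given overlap. This is only an illustration under assumed conventions (fixed-length segments, hop size equal to segment length times one minus the overlap); the function name and parameters are hypothetical, and the sketch does not implement the decoding or hypothesis-merging steps of either overlapping inference method.

```python
def segment_bounds(total_len, seg_len, overlap):
    """Return (start, end) index pairs covering [0, total_len).

    overlap is the fraction of each segment shared with the next one,
    e.g. 0.5 for 50% overlapping. The last segment may be shorter.
    """
    if total_len <= 0 or seg_len <= 0:
        raise ValueError("total_len and seg_len must be positive")
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    hop = max(1, int(round(seg_len * (1.0 - overlap))))
    bounds = []
    start = 0
    while True:
        end = min(start + seg_len, total_len)
        bounds.append((start, end))
        if end >= total_len:
            break
        start += hop
    return bounds


if __name__ == "__main__":
    # Hypothetical lengths in frames: more overlap means more segments to decode.
    for overlap in (0.0, 0.25, 0.5):
        print(overlap, len(segment_bounds(100, 20, overlap)))
```

With these assumed lengths, 50% overlap requires nearly twice as many segments as no overlap, which is consistent with lower overlap giving faster inference.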

Various attention mechanisms are being widely applied to acoustic scene classification. However, we empirically found that the attention mechanism can excessively discard potentially valuable information, despite improving performance. We propose the attentive max feature map that combines two effective techniques, attention and a max feature map, to further elaborate the attention mechanism and mitigate the above-mentioned phenomenon. We also explore various joint training methods, including multi-task learning, that allocate additional abstract labels for each audio recording.
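
For reference, a plain max feature map, as it is commonly defined (split the channels into two halves and keep the element-wise maximum), can be sketched as follows. How the proposed method combines this operation with attention is not described here, so the attention part is omitted; the function name and the list-based representation are assumptions made for illustration.

```python
def max_feature_map(channels):
    """Element-wise maximum over the two halves of a channel list.

    channels: list of 2k equal-length lists (one list per channel).
    Returns k lists, halving the channel count.
    """
    if len(channels) % 2 != 0:
        raise ValueError("need an even number of channels")
    half = len(channels) // 2
    first, second = channels[:half], channels[half:]
    return [
        [max(a, b) for a, b in zip(c1, c2)]
        for c1, c2 in zip(first, second)
    ]


if __name__ == "__main__":
    # Four hypothetical channels with two values each.
    features = [[1, 5], [2, 0], [3, 1], [0, 4]]
    print(max_feature_map(features))  # [[3, 5], [2, 4]]
```

Because each output value is selected from one of two candidates rather than scaled toward zero, the operation keeps one competing activation intact, which is one plausible reading of why it is paired with attention here.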

We propose a variational Bayesian (VB) approach to learning distributions of latent variables in deep neural network (DNN) models for cross-domain knowledge transfer, to address acoustic mismatches between training and testing conditions. Conventional maximum a posteriori estimation performs point estimation and risks the curse of dimensionality when a huge number of model parameters must be estimated. Instead, we focus on estimating a manageable number of latent variables of DNNs via a VB inference framework.
