
ICASSP is the world's largest and most comprehensive technical conference on signal processing and its applications. It provides a networking opportunity for like-minded professionals from around the world. The ICASSP 2017 conference will feature presentations by internationally renowned speakers and cutting-edge session topics.

Nonnegative matrix factorization (NMF) has recently been applied to temporal decomposition (TD) of speech spectral envelopes represented by line spectral frequencies (LSFs). A couple of inherent TD constraints, which are otherwise handled as ad hoc exceptions, have also been incorporated using NMF, including LSF ordering and monotonic event functions. Here, these constraints are analyzed and a third inherent constraint is incorporated into an NMF analysis.


Spike and slab priors have been of much recent interest in signal processing as a means of inducing sparsity in Bayesian inference. Application domains that benefit from the use of these priors include sparse recovery, regression and classification. It is well known that solving for the sparse coefficient vector that maximizes these priors results in a hard, non-convex, mixed-integer programming problem. Most existing solutions to this optimization problem either involve simplifying assumptions or relaxations, or are computationally expensive.


Boosting of the achievable degrees of freedom (DoF) has been
demonstrated on a single-input single-output (SISO) X channel
by using outdated and instantaneous channel state information
at transmitter (CSIT) synergistically, in contrast to
that of using completely outdated CSIT. However, the means
by which the DoF gain can be obtained in a multiple-input
multiple-output (MIMO) system remains unclear. This paper
proposes an interference alignment scheme with synergistic
CSIT for the MIMO X channel. We show that the achievable


Recurrent neural networks (RNNs) have shown clear superiority in sequence modeling, particularly the ones with gated units, such as long short-term memory (LSTM) and gated recurrent unit (GRU). However, the dynamic properties behind the remarkable performance remain unclear in many applications, e.g., automatic speech recognition (ASR). This paper employs visualization techniques to study the behavior of LSTM and GRU when performing speech recognition tasks.


The universal speech attributes for speaker verification (SV)
are addressed in this paper. The aim of this work is to
exploit fundamental characteristics across different speakers
within the deep neural network (DNN)/i-vector framework.
The manner and place of articulation form the fundamental
speech attribute unit inventory, and new attribute units for
acoustic modelling are generated by a two-step automatic
clustering method in this paper. The DNN based on
universal attribute units is used to generate posterior


Speaker diarization in noisy conditions is addressed in this paper. The regression-based DNN is first adopted to map the noisy acoustic features to the clean features, and then consensus clustering of the original and mapped features is used to fuse the diarization results. The experiments are conducted on the IFLY-DIAR-II database, which is a Chinese talk show database with various noise types, such as music, applause and laughter. Compared to the baseline system using PLP features, a 21.26% relative DER improvement can be achieved using the proposed algorithm.



The segment-tree (ST) based cost aggregation algorithm for stereo matching successfully integrates segmentation information with the non-local cost aggregation framework. The tree structure, which is generated by the segmentation strategy, directly determines the final results for this kind of algorithm. However, the original strategy performs poorly because it is coarse and fails to satisfy the disparity consistency assumption.

