ICASSP is the world's largest and most comprehensive technical conference on signal processing and its applications. It provides a fantastic networking opportunity for like-minded professionals from around the world. The ICASSP 2017 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics.
- Read more about NON-NEGATIVE TEMPORAL DECOMPOSITION REGULARIZATION WITH AN AUGMENTED LAGRANGIAN
Nonnegative matrix factorization (NMF) has recently been applied to temporal decomposition (TD) of speech spectral envelopes represented by line spectral frequencies (LSFs). A couple of inherent TD constraints, which are otherwise handled as ad hoc exceptions, have also been incorporated using NMF, including LSF ordering and monotonic event functions. Here, these constraints are analyzed and a third inherent constraint is incorporated into an NMF analysis.
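The LSF ordering constraint mentioned above can be illustrated with a small check. This is a minimal sketch, not the paper's NMF formulation: the function name `is_valid_lsf` and all sample values are hypothetical, and it assumes LSFs are given in radians inside the open interval (0, pi). It also shows why nonnegative weights are convenient: a convex combination of two ordered vectors stays ordered.

```python
import math

def is_valid_lsf(lsf):
    """Return True if the LSF vector is strictly increasing inside (0, pi)."""
    if not lsf:
        return False
    if lsf[0] <= 0.0 or lsf[-1] >= math.pi:
        return False
    return all(a < b for a, b in zip(lsf, lsf[1:]))

# Hypothetical event targets (illustrative values only).
target_a = [0.3, 0.7, 1.2, 1.9, 2.6]
target_b = [0.5, 0.9, 1.5, 2.0, 2.8]
crossed = [0.3, 1.2, 0.7, 1.9, 2.6]  # two frequencies swapped

# Nonnegative weights that sum to one, as an event-function value pair might.
mixed = [0.4 * a + 0.6 * b for a, b in zip(target_a, target_b)]

print(is_valid_lsf(target_a))  # ordered input
print(is_valid_lsf(crossed))   # ordering violated
print(is_valid_lsf(mixed))     # combination of ordered targets
```

With unconstrained (possibly negative) weights the last check is no longer guaranteed, which is one way to see why a nonnegative factorization fits this representation.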
Spike and slab priors have been of much recent interest in signal processing as a means of inducing sparsity in Bayesian inference. Application domains that benefit from the use of these priors include sparse recovery, regression and classification. It is well known that solving for the sparse coefficient vector that maximizes these priors results in a hard, non-convex, mixed-integer programming problem. Most existing solutions to this optimization problem either involve simplifying assumptions or relaxations, or are computationally expensive.
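For readers unfamiliar with the prior, the sketch below draws a sparse coefficient vector from a basic spike-and-slab model: each coefficient is exactly zero (the spike) unless a Bernoulli indicator switches on a Gaussian slab. The names `sample_spike_slab`, `kappa` and `sigma` are illustrative assumptions, not notation from the paper, and the code covers only the generative side, not the optimization problem.

```python
import random

def sample_spike_slab(n, kappa=0.2, sigma=1.0, seed=0):
    """Draw n coefficients from a simple spike-and-slab prior.

    kappa: assumed probability that a coefficient is active (slab).
    sigma: assumed standard deviation of the Gaussian slab.
    """
    rng = random.Random(seed)
    # Binary indicators: 1 selects the slab, 0 selects the spike at zero.
    indicators = [1 if rng.random() < kappa else 0 for _ in range(n)]
    coeffs = [rng.gauss(0.0, sigma) if g else 0.0 for g in indicators]
    return indicators, coeffs

indicators, coeffs = sample_spike_slab(20)
print("active entries:", sum(indicators), "of", len(coeffs))
```

The binary indicators are what make the estimation problem mixed-integer: recovering the coefficients jointly with their on/off pattern combines a discrete search with a continuous one.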
- Read more about Interference Alignment on MIMO X channel with Synergistic CSIT
A boost in achievable degrees of freedom (DoF) has been
demonstrated on a single-input single-output (SISO) X channel
by using outdated and instantaneous channel state information
at the transmitter (CSIT) synergistically, in contrast to
using completely outdated CSIT. However, the means
by which the DoF gain can be obtained in a multiple-input
multiple-output (MIMO) system remains unclear. This paper
proposes an interference alignment scheme with synergistic
CSIT for the MIMO X channel. We show that the achievable
- Read more about QUALITY ESTIMATION BASED MULTI-FOCUS IMAGE FUSION
- Read more about MEMORY VISUALIZATION FOR GATED RECURRENT NEURAL NETWORKS IN SPEECH RECOGNITION
Recurrent neural networks (RNNs) have shown clear superiority in sequence modeling, particularly the ones with gated units, such as long short-term memory (LSTM) and gated recurrent unit (GRU). However, the dynamic properties behind the remarkable performance remain unclear in many applications, e.g., automatic speech recognition (ASR). This paper employs visualization techniques to study the behavior of LSTM and GRU when performing speech recognition tasks.
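To make the object of study concrete, here is a minimal scalar GRU step that records gate activations over a short input sequence, the kind of trace one might plot. It is a sketch under stated assumptions: the weights are arbitrary hypothetical values, the state is one-dimensional, and the update convention `h = (1 - z) * h_prev + z * h_cand` is one of two common ones. The paper's actual models and visualization techniques are not reproduced here.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(x, h_prev, w):
    """One scalar GRU update; returns the new state and both gate values."""
    z = sigmoid(w["wz"] * x + w["uz"] * h_prev + w["bz"])   # update gate
    r = sigmoid(w["wr"] * x + w["ur"] * h_prev + w["br"])   # reset gate
    h_cand = math.tanh(w["wh"] * x + w["uh"] * (r * h_prev) + w["bh"])
    h = (1.0 - z) * h_prev + z * h_cand
    return h, z, r

# Hypothetical weights, chosen only for illustration.
weights = {"wz": 0.8, "uz": -0.5, "bz": 0.0,
           "wr": 0.3, "ur": 0.9, "br": 0.1,
           "wh": 1.2, "uh": 0.7, "bh": 0.0}

inputs = [0.5, -1.0, 0.25, 2.0]
h = 0.0
trace = []
for t, x in enumerate(inputs):
    h, z, r = gru_step(x, h, weights)
    trace.append((t, z, r, h))

for t, z, r, h_t in trace:
    print(f"t={t} update={z:.3f} reset={r:.3f} state={h_t:.3f}")
```

Plotting the `update` and `reset` columns against time is the simplest form of gate visualization; an LSTM would add a separate cell state and three gates instead of two.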
- Read more about EXPLORING UNIVERSAL SPEECH ATTRIBUTES FOR SPEAKER VERIFICATION
The universal speech attributes for speaker verification (SV)
are addressed in this paper. The aim of this work is to
exploit fundamental characteristics across different speakers
within the deep neural network (DNN)/i-vector framework.
The manner and place of articulation form the fundamental
speech attribute unit inventory, and new attribute units for
acoustic modelling are generated by a two-step automatic
clustering method in this paper. The DNN based on
universal attribute units is used to generate posterior
Speaker diarization in noisy conditions is addressed in this paper. The regression-based DNN is first adopted to map the noisy acoustic features to the clean features, and then consensus clustering of the original and mapped features is used to fuse the diarization results. The experiments are conducted on the IFLY-DIAR-II database, which is a Chinese talk show database with various noise types, such as music, applause and laughter. Compared to the baseline system using PLP features, a 21.26% relative DER improvement can be achieved using the proposed algorithm.
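A relative DER improvement is the reduction in diarization error rate divided by the baseline rate. The helper below is a small illustration of that arithmetic; the function name is hypothetical and the DER values passed to it are made-up numbers, not results from the paper.

```python
def relative_improvement(baseline_der, proposed_der):
    """Relative reduction in DER, in percent of the baseline DER."""
    if baseline_der <= 0:
        raise ValueError("baseline DER must be positive")
    return 100.0 * (baseline_der - proposed_der) / baseline_der

# Hypothetical example: a baseline DER of 20% reduced to 15%.
print(round(relative_improvement(20.0, 15.0), 2))  # 25.0
```

Note that a relative figure says nothing about the absolute rates on its own: the same percentage can correspond to very different absolute DER reductions depending on the baseline.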
- Read more about Segment-Tree Based Cost Aggregation for Stereo Matching with Enhanced Segmentation Advantage
The segment-tree (ST) based cost aggregation algorithm for stereo matching successfully integrates segmentation information with the non-local cost aggregation framework. The tree structure, which is generated by the segmentation strategy, directly determines the final results for this kind of algorithm. However, the original strategy performs unreasonably because its segmentation is coarse, and it fails to meet the disparity consistency assumption.