ICASSP 2017

ICASSP is the world's largest and most comprehensive technical conference on signal processing and its applications. It provides a networking opportunity for like-minded professionals from around the world. The ICASSP 2017 conference will feature presentations by internationally renowned speakers and cutting-edge session topics.

NON-NEGATIVE TEMPORAL DECOMPOSITION REGULARIZATION WITH AN AUGMENTED LAGRANGIAN

Nonnegative matrix factorization (NMF) has recently been applied to temporal decomposition (TD) of speech spectral envelopes represented by line spectral frequencies (LSFs). A couple of inherent TD constraints, which are otherwise handled as ad hoc exceptions, have also been incorporated using NMF, including LSF ordering and monotonic event functions. Here, these constraints are analyzed and a third inherent constraint is incorporated into an NMF analysis.
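For readers unfamiliar with NMF, the following is a minimal sketch of the plain, unconstrained factorization V ≈ WH using the standard Lee-Seung multiplicative updates for the squared-error cost. It is not the method of the paper: the LSF ordering constraint, the monotonic event functions, and the augmented Lagrangian are all left out, and the function names and the toy matrix are assumptions made for illustration only.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-12):
    """Factor a nonnegative matrix V as W H with nonnegative W and H.

    Uses generic multiplicative updates; no temporal decomposition
    constraints are enforced (illustrative assumption, not the paper's method).
    """
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    # Strictly positive initialization keeps every update well defined.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(rows)]
    h = [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(cols)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(rows)]
    return w, h


if __name__ == "__main__":
    # Toy nonnegative matrix (hypothetical data): rows are features, columns are frames.
    toy = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [1.0, 1.0, 1.0, 1.0]]
    w, h = nmf(toy, rank=2)
    print("squared reconstruction error:", frobenius_error(toy, w, h))
```

In a TD setting the columns of V would hold per-frame spectral parameters, with one factor playing the role of event targets and the other the role of event functions. Enforcing the constraints named in the abstract would require additional machinery beyond this sketch.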

ica2017ntdalmposter.pdf

ICASSP2017marjonaramirezSP-P1.10 (334)

Categories: Speech Analysis (SPE-ANLS)
Speech Coding (SPE-CODI)

19 Views

ADAPTIVE MATCHING PURSUIT FOR SPARSE SIGNAL RECOVERY

Spike-and-slab priors have been of much recent interest in signal processing as a means of inducing sparsity in Bayesian inference. Application domains that benefit from the use of these priors include sparse recovery, regression and classification. It is well known that solving for the sparse coefficient vector that maximizes these priors results in a hard, non-convex, mixed-integer programming problem. Most existing solutions to this optimization problem either involve simplifying assumptions or relaxations, or are computationally expensive.
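To illustrate why a spike-and-slab prior induces sparsity, the sketch below draws coefficients from one common form of the prior: each coefficient is exactly zero (the spike) unless a Bernoulli indicator switches on a zero-mean Gaussian (the slab). The parameter names `inclusion_prob` and `slab_std` are assumptions chosen for this example, and the code only samples from the prior; it does not implement the recovery algorithm of the paper.

```python
import random


def sample_spike_and_slab(n, inclusion_prob, slab_std, seed=0):
    """Draw n coefficients from a Bernoulli-Gaussian spike-and-slab prior.

    With probability inclusion_prob a coefficient comes from the slab
    N(0, slab_std^2); otherwise it is set to exactly zero (the spike).
    """
    rng = random.Random(seed)
    coeffs = []
    for _ in range(n):
        if rng.random() < inclusion_prob:
            coeffs.append(rng.gauss(0.0, slab_std))  # slab: active coefficient
        else:
            coeffs.append(0.0)  # spike: coefficient switched off
    return coeffs


if __name__ == "__main__":
    x = sample_spike_and_slab(n=20, inclusion_prob=0.2, slab_std=1.0)
    active = [i for i, value in enumerate(x) if value != 0.0]
    print("active indices:", active)
```

The binary on/off indicator per coefficient is what makes the corresponding maximization a mixed-integer problem: the support must be chosen jointly with the coefficient values.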

Poster_ICASSP_2017_AMP.pdf (579)

Categories: Adaptive Signal Processing

13 Views

Interference Alignment on MIMO X channel with Synergistic CSIT

A boost in the achievable degrees of freedom (DoF) has been
demonstrated on a single-input single-output (SISO) X channel
by using outdated and instantaneous channel state information
at the transmitter (CSIT) synergistically, in contrast to
using completely outdated CSIT alone. However, the means
by which the DoF gain can be obtained in a multiple-input
multiple-output (MIMO) system remains unclear. This paper
proposes an interference alignment scheme with synergistic
CSIT for the MIMO X channel. We show that the achievable

ICASSP17Poster.pdf

Poster (630)

ICASSP17.pdf

Paper (648)

Categories: Multi-antenna and Multi-channel Signal Processing for Communications

17 Views

QUALITY ESTIMATION BASED MULTI-FOCUS IMAGE FUSION

Poster_2182.pdf

Poster_paperID2182 (564)

Categories: Image/Video Processing

2 Views

MEMORY VISUALIZATION FOR GATED RECURRENT NEURAL NETWORKS IN SPEECH RECOGNITION

Recurrent neural networks (RNNs) have shown clear superiority in sequence modeling, particularly the ones with gated units, such as long short-term memory (LSTM) and gated recurrent unit (GRU). However, the dynamic properties behind the remarkable performance remain unclear in many applications, e.g., automatic speech recognition (ASR). This paper employs visualization techniques to study the behavior of LSTM and GRU when performing speech recognition tasks.
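As a rough illustration of the kind of quantities one might visualize in a gated network, the sketch below runs a scalar GRU cell over a short input sequence and records the update gate, reset gate, and hidden state at each step. It is a toy with made-up parameter names and values, not the visualization pipeline of the paper. Note that references differ on whether the update gate weights the candidate state or the previous state; this sketch uses h = (1 - z) * h_prev + z * h_cand.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h_prev, p):
    """One step of a scalar GRU cell; p is a dict of scalar weights (hypothetical names)."""
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])  # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])  # candidate state
    h = (1.0 - z) * h_prev + z * h_cand  # blend of old state and candidate
    return h, z, r


def trace_gates(inputs, params, h0=0.0):
    """Run the cell over a sequence and record values worth plotting."""
    h, trace = h0, []
    for x in inputs:
        h, z, r = gru_step(x, h, params)
        trace.append({"h": h, "update": z, "reset": r})
    return trace


if __name__ == "__main__":
    # Arbitrary illustrative parameters, not trained values.
    params = {"wz": 1.0, "uz": -0.5, "bz": 0.0,
              "wr": 0.5, "ur": 0.5, "br": 0.0,
              "wh": 1.0, "uh": 1.0, "bh": 0.0}
    for step, record in enumerate(trace_gates([0.5, -1.0, 2.0, 0.0], params)):
        print(step, record)
```

Plotting the recorded `update` and `reset` values over time, per unit, is one simple way to inspect how a gated cell retains or overwrites its memory.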

icassp17_visual.pdf (671)

Categories: Neural network learning (MLR-NNLR)
Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)

11 Views

EXPLORING UNIVERSAL SPEECH ATTRIBUTES FOR SPEAKER VERIFICATION

The universal speech attributes for speaker verification (SV)
are addressed in this paper. The aim of this work is to
exploit fundamental characteristics across different speakers
within the deep neural network (DNN)/i-vector framework.
The manner and place of articulation form the fundamental
speech attribute unit inventory, and new attribute units for
acoustic modelling are generated by a two-step automatic
clustering method in this paper. The DNN based on
universal attribute units is used to generate posterior

ICASSP2017_shengzhang_v2.pdf (619)

Categories: Speaker Recognition and Characterization (SPE-SPKR)

4 Views

FEATURE MAPPING FOR SPEAKER DIARIZATION IN NOISY CONDITIONS

Speaker diarization in noisy conditions is addressed in this paper. The regression-based DNN is first adopted to map the noisy acoustic features to the clean features, and then consensus clustering of the original and mapped features is used to fuse the diarization results. The experiments are conducted on the IFLY-DIAR-II database, which is a Chinese talk show database with various noise types, such as music, applause and laughter. Compared to the baseline system using PLP features, a 21.26% relative DER improvement can be achieved using the proposed algorithm.
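A relative improvement in diarization error rate (DER) is measured against the baseline DER rather than in absolute percentage points. The sketch below shows the arithmetic. The absolute DER values in the usage example are hypothetical, chosen only so that the result matches the 21.26% figure quoted above; the abstract does not report the underlying absolute numbers.

```python
def relative_improvement(baseline_der, proposed_der):
    """Relative DER improvement in percent: 100 * (baseline - proposed) / baseline."""
    if baseline_der <= 0:
        raise ValueError("baseline DER must be positive")
    return 100.0 * (baseline_der - proposed_der) / baseline_der


if __name__ == "__main__":
    # Hypothetical absolute DERs in percent, for illustration only.
    baseline, proposed = 20.0, 15.748
    print(round(relative_improvement(baseline, proposed), 2))
```

Under these assumed numbers, an absolute drop of about 4.25 points from a 20% baseline corresponds to the quoted relative improvement.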

poster_wxzhu_v2.pptx (670)

Categories: Audio and Acoustic Signal Processing

5 Views

Segment-Tree Based Cost Aggregation for Stereo Matching with Enhanced Segmentation Advantage

The segment-tree (ST) based cost aggregation algorithm for stereo matching successfully integrates segmentation information into the non-local cost aggregation framework. The tree structure, which is generated by the segmentation strategy, directly determines the final results for this kind of algorithm. However, the original strategy behaves unreasonably because of its coarse performance and fails to satisfy the disparity consistency assumption.