
ICASSP is the world's largest and most comprehensive technical conference on signal processing and its applications. It provides a networking opportunity for like-minded professionals from around the world. The ICASSP 2016 conference will feature presentations by internationally renowned speakers and sessions on cutting-edge topics.

We propose sparse reconstruction techniques to improve the quality and/or reduce the bit rate of standard speech coders. To that end, we assume signal sparsity in some transform domain and formulate the problem of reconstructing the original signal in terms of constrained ℓ1-norm minimization. We use modern primal-dual methods to solve the resulting non-smooth convex optimization problem. Experiments show that the proposed sparse reconstruction method can substantially improve the instrumentally predicted speech quality.


Large-scale antenna (LSA) systems have gained a lot of attention recently since they can significantly improve the performance of wireless systems. Similar to multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM), or MIMO-OFDM, LSA can also be combined with OFDM to deal with frequency selectivity in wireless channels. However, such a combination suffers from substantially increased complexity, proportional to the number of antennas in LSA systems. For the conventional


Recent studies have been revisiting whole words as the basic modelling unit in speech recognition and query applications, instead of phonetic units. Such whole-word segmental systems rely on a function that maps a variable-length speech segment to a vector in a fixed-dimensional space; the resulting acoustic word embeddings need to allow for accurate discrimination between different word types, directly in the embedding space. We compare several old and new approaches in a word discrimination task.


In the past 5 years, significant advances in Large Vocabulary Speech Recognition (LVSR), Deep Learning (DL) and Spoken Language Understanding (SLU), along with the explosive growth of wireless network bandwidth, have given rise to three compelling Conversational AI agents that are available on Android, iOS and Microsoft smartphones. Conversational AI agents such as Google Now, Apple Siri and Microsoft Cortana are now the preferred way to perform mobile web search and command and control of the various smartphone apps.


In the last decade, it was shown that signals with finite rate of innovation (FRI signals) can be reconstructed from the samples of their filtered versions. However, when noise is present, existing reconstruction algorithms tend to have low accuracy. In this work, a new sparsity-based reconstruction method for FRI signals is put forward. Streams of Diracs and exponential reproducing kernels are considered. First, the analog time axis is quantized and aligned to a grid.


Since the deep neural network (DNN)-based acoustic model appeared, the performance of automatic speech recognition has greatly improved. Following this achievement, various studies on DNN-based techniques for noise robustness are also in progress. Among these approaches, the noise-aware training (NAT) technique, which aims to improve the inherent robustness of the DNN using noise estimates, has shown remarkable performance. Despite this performance, however, we cannot be certain whether NAT is an optimal method for fully utilizing the inherent robustness of the DNN.


A spherical harmonics root-MUSIC (MUltiple SIgnal Classification) technique for source localization using a spherical microphone array is presented in this paper. Earlier work on root-MUSIC is limited to linear and planar arrays. Root-MUSIC for planar arrays utilizes the concepts of manifold separation and beamspace transformation. In this paper, the Vandermonde structure of the array manifold for a particular order is proved. Hence, the validity of root-MUSIC in the spherical harmonics domain is confirmed. The proposed method is evaluated using simulated experiments on source localization.


We address the problem of terrain-scattered jammer suppression in multiple-input multiple-output (MIMO) radar using space-(fast) time adaptive processing (SFTAP). The correlation function of the jamming components after matched filtering at the receiving end of the MIMO radar is derived, and its relationship to the correlation matrix of the transmitted waveforms is established. This correlation function serves as a theoretical measure for evaluating the effect of matched filtering on the received jamming signals.
