
ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2020 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

To reduce the residual energy of a video signal, motion-compensated prediction with fractional-sample accuracy has been successfully employed in modern video coding technology. In contrast to the fixed quarter-sample motion vector resolution for the luma component in the High Efficiency Video Coding standard, the current draft of the new Versatile Video Coding standard introduces a block-level adaptive motion vector resolution (AMVR) scheme. AMVR allows motion vector differences to be coded at different precisions.
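As a rough illustration of coding a motion vector difference at different precisions, the sketch below rounds one difference component, stored in quarter-sample units, to a coarser step. The function name `round_to_precision`, the `shift` parameter, and the symmetric rounding rule are assumptions made for this example; they are not taken from the abstract and are not the normative rounding of any standard.

```python
def round_to_precision(mvd, shift):
    """Round one motion vector difference component to a coarser step.

    mvd   : component in quarter-sample units (assumed storage unit)
    shift : log2 of the step in quarter-sample units, e.g. 0 keeps
            quarter-sample precision, 2 gives integer-sample precision,
            4 gives four-sample precision (hypothetical choices)
    """
    if shift == 0:
        return mvd
    step = 1 << shift
    sign = -1 if mvd < 0 else 1
    # Symmetric rounding to the nearest multiple of `step` (illustrative).
    return sign * ((abs(mvd) + step // 2) // step) * step


def coded_value(mvd, shift):
    """Value that would be written after dropping the implied low bits."""
    rounded = round_to_precision(mvd, shift)
    sign = -1 if rounded < 0 else 1
    return sign * (abs(rounded) >> shift)


if __name__ == "__main__":
    for shift in (0, 2, 4):
        print(shift, round_to_precision(23, shift), coded_value(23, shift))
```

A coarser precision yields a smaller coded magnitude at the cost of a less exact motion vector, which is the trade-off a block-level choice of precision can exploit.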

Graph signal processing on directed graphs poses theoretical challenges since an eigendecomposition of filters is in general not available. Instead, Fourier analysis requires a Jordan decomposition and the frequency response is given by the Jordan normal form, whose computation is numerically unstable for large sizes. In this paper, we propose to replace a given adjacency shift A by a diagonalizable shift A_D obtained via the Jordan-Chevalley decomposition.
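The Jordan-Chevalley decomposition writes a square matrix as the sum of a diagonalizable part and a nilpotent part that commute with each other. The sketch below checks these three properties by hand on a single 2x2 Jordan block; the matrix, the names `A_D` and `A_N`, and the helper functions are chosen for illustration only and do not come from the paper.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def matadd(x, y):
    """Add two matrices of the same shape."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


# A single Jordan block with eigenvalue 2: not diagonalizable.
A = [[2, 1],
     [0, 2]]

# Illustrative split for this block: diagonalizable part and nilpotent part.
A_D = [[2, 0],
       [0, 2]]
A_N = [[0, 1],
       [0, 0]]

ZERO = [[0, 0], [0, 0]]

is_sum = matadd(A_D, A_N) == A                     # A = A_D + A_N
commutes = matmul(A_D, A_N) == matmul(A_N, A_D)    # the parts commute
is_nilpotent = matmul(A_N, A_N) == ZERO            # A_N squared is zero

if __name__ == "__main__":
    print(is_sum, commutes, is_nilpotent)
```

Here A_D is already diagonal, so its eigendecomposition is trivial, whereas A itself has only a Jordan form.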

Dynamic dataflow models of computation have become widely used through their adoption in popular programming frameworks such as TensorFlow and GNU Radio. Although dynamic dataflow models offer more programming freedom, they lack analyzability compared to their static counterparts (such as synchronous dataflow). In this paper, we advocate the use of a boundedly dynamic dataflow model of computation, VR-PRUNE, that remains analyzable but still offers more programming freedom than a fully static dataflow model.

Hyperspectral image (HSI) denoising is of crucial importance for many subsequent applications, such as HSI classification and interpretation. In this paper, we propose an attention-based deep residual network to directly learn a mapping from a noisy HSI to its clean counterpart. To jointly utilize the spatial-spectral information, the current band and its K adjacent bands are simultaneously exploited as the input.
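One way to picture the input selection is as an index computation: for each band, pick the K nearest spectral neighbours to accompany it. The sketch below is a guess at such a selection; the function name `adjacent_band_indices`, the split of K/2 neighbours on each side, and the window shift at the ends of the spectrum are all assumptions, since the abstract does not specify them.

```python
def adjacent_band_indices(band, k, num_bands):
    """Return the indices of the k bands adjacent to `band`.

    Assumes roughly k/2 neighbours on each side; near the first or last
    band the window is shifted inward so that k neighbours are always
    returned (a hypothetical boundary rule).
    """
    if not 0 <= band < num_bands:
        raise ValueError("band index out of range")
    if k >= num_bands:
        raise ValueError("k must be smaller than the number of bands")
    # Window of k + 1 consecutive bands that contains `band`.
    start = min(max(band - k // 2, 0), num_bands - (k + 1))
    return [i for i in range(start, start + k + 1) if i != band]


if __name__ == "__main__":
    for b in (0, 5, 9):
        print(b, adjacent_band_indices(b, 4, 10))
```

The returned indices could then be used to stack the neighbouring bands next to the current band before passing them to a network.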

Nonlinear spectral-mapping models based on supervised learning have been successfully applied to speech enhancement. However, as supervised learning approaches, they require a large amount of labelled data (noisy-clean speech pairs) for training. In addition, their performance under unseen noisy conditions is not guaranteed, which is a common weak point of supervised learning approaches. In this study, we propose an unsupervised learning approach for speech enhancement: a denoising autoencoder with linear regression decoder (DAELD) model.

Voice conversion (VC) is a task that alters the voice of a person to suit different styles while conserving the linguistic content. The previous state-of-the-art technology used in VC was based on the sequence-to-sequence (seq2seq) model, which could lose linguistic information. There was an attempt to overcome this problem using textual supervision; however, this required explicit alignment, and therefore the benefit of using the seq2seq model was lost. In this study, a voice converter that utilizes multitask learning with text-to-speech (TTS) is presented.
