ICASSP 2018

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

CLASSIFICATION OF CORALS IN REFLECTANCE AND FLUORESCENCE IMAGES USING CONVOLUTIONAL NEURAL NETWORK REPRESENTATIONS

Coral species, with complex morphology and ambiguous boundaries, pose a great challenge for automated classification. CNN activations, which are extracted from fully connected layers of deep networks (FC features), have been successfully used as powerful universal representations in many visual tasks. In this paper, we investigate the transferability and combined performance of FC features and CONV features (extracted

ICASSP2018 Poster.pdf (372)

Categories: Audio and Acoustic Signal Processing

40 Views

VR IQA NET: Deep Virtual Reality Image Quality Assessment using Adversarial Learning

In this paper, we propose a novel virtual reality image quality assessment (VR IQA) method with adversarial learning for omnidirectional images. To take into account the characteristics of omnidirectional images, we devise deep networks that include a novel quality score predictor and a human perception guider. The proposed quality score predictor automatically predicts the quality score of a distorted image using latent spatial and position features.

VR-IQA-NET_ICASSP_ppt.pdf

VR IQA NET-ICASSP2018 (607)

Categories: Quality Assessment; Virtual reality and 3D imaging; Neural network learning (MLR-NNLR)

25 Views

DEEP FACTORIZATION FOR SPEECH SIGNAL

Various informative factors are mixed in speech signals, which makes decoding any one of them very difficult. An intuitive idea is to factorize each speech frame into individual informative factors, though this turns out to be highly difficult. Recently, we found that speaker traits, which were assumed to be long-term distributional properties, are actually short-time patterns and can be learned by a carefully designed deep neural network (DNN). This discovery motivated the cascade deep factorization (CDF) framework presented in this paper.

180417-deepFactor-LLT.pptx (441)

Categories: Speaker Recognition and Characterization (SPE-SPKR)

20 Views

FULL-INFO TRAINING FOR DEEP SPEAKER FEATURE LEARNING

Recent studies have shown that speaker patterns can be learned from very short speech segments (e.g., 0.3 seconds) by a carefully designed convolutional and time-delay deep neural network (CT-DNN) model. By forcing the model to discriminate between the speakers in the training data, frame-level speaker features can be derived from the last hidden layer.

180418-Full_info-LLT.pptx (460)

Categories: Speaker Recognition and Characterization (SPE-SPKR)

6 Views

High-speed Optical Camera Communication Using an Optimally Modulated Signal

This paper describes a high-speed optical camera communication (OCC) technique using an LED and a rolling-shutter camera. In the proposed technique, the symbols being transmitted are encoded as time delays of optimally modulated signals derived theoretically. A receiver decodes the symbols by using intensities obtained from four consecutive line sensors of a camera.

skah-icassp2018-poster.pdf (535)

Categories: Communications and Networking

14 Views

Deep Feature Embedding Learning for Person Re-Identification Using Lifted Structured Loss

In this paper, we propose deep feature embedding learning for person re-identification (re-id) using lifted structured loss. Although triplet loss has been commonly used in deep neural networks for person re-id, the triplet loss-based framework does not make full use of the batch information. As a result, hard negative samples must be chosen manually, which is very time-consuming. To address this problem, we adopt lifted structured loss for deep neural networks, which makes the network learn a better feature embedding by minimizing intra-class variation and maximizing inter-class variation.
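
As a rough illustration of the loss named above (not the authors' implementation), the following NumPy sketch follows the commonly cited formulation of lifted structured loss, in which every positive pair in a batch is compared against all negatives of both of its members. The function name, the `margin` default, and the toy inputs are assumptions made for illustration only.

```python
import numpy as np

def lifted_structured_loss(embeddings, labels, margin=1.0):
    """Illustrative lifted structured loss over one batch (hypothetical helper)."""
    x = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels)
    # Pairwise Euclidean distances D[i, j] between all embeddings in the batch.
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    same = y[:, None] == y[None, :]
    # For each sample, sum exp(margin - D) over all of its negatives in the batch.
    neg_sum = np.where(same, 0.0, np.exp(margin - dist)).sum(axis=1)
    total, num_pos = 0.0, 0
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            if not same[i, j]:
                continue  # only positive pairs contribute terms
            num_pos += 1
            s = neg_sum[i] + neg_sum[j]
            if s > 0.0:  # with no negatives in the batch the pair adds nothing
                j_ij = np.log(s) + dist[i, j]
                total += max(0.0, j_ij) ** 2
    return total / (2 * num_pos) if num_pos else 0.0

# Toy batches (assumed data): collapsed embeddings vs. well-separated identities.
collapsed = lifted_structured_loss(np.zeros((4, 2)), [0, 0, 1, 1])
separated = lifted_structured_loss([[0, 0], [0, 0], [10, 0], [10, 0]], [0, 0, 1, 1])
```

In this toy setup the well-separated batch incurs no loss, while the collapsed batch is penalized, which reflects the stated goal of small intra-class and large inter-class variation without hand-picking hard negatives.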

ICASSP2018_PersonReID_final.pdf (604)

Categories: Applications

76 Views

Determined Blind Source Separation via Proximal Splitting Algorithm

The state-of-the-art algorithms of determined blind source separation (BSS) methods based on the independent component analysis

2018.04.20_ICASSPポスター_PDS_ICA.pdf (366)

Categories: Audio and Acoustic Signal Processing

35 Views

EXTENDABLE NEURAL MATRIX COMPLETION

ICASSP-MC-poster.pdf (522)

Categories: Emerging: Big Data

17 Views

Phase Corrected Total Variation for Audio Signals

In optimization-based signal processing, the so-called prior term models the desired signal, and therefore its design is the key factor in achieving good performance. For audio signals, the time-directional total variation applied to a spectrogram in combination with phase correction has been proposed recently to model sinusoidal components of the signal. Although it is a promising prior, its applicability might be restricted to some extent because of the mismatch between the assumption and the signal.

2018.04.20_ICASSPポスター_iPC_TV.pdf (392)

Categories: Audio and Acoustic Signal Processing

52 Views

SEQUENTIAL INFERENCE METHODS FOR NON-HOMOGENEOUS POISSON PROCESSES WITH STATE-SPACE PRIOR

The non-homogeneous Poisson process (NHPP) is a point process with time-varying intensity across its domain, and it arises in numerous areas of signal processing, machine learning and many other fields. However, its applications are largely limited by the intractable likelihood and the high computational cost of existing inference schemes. We present an online inference framework that utilises generative Poisson data and a sequential Markov chain Monte Carlo (SMCMC) algorithm, which achieves improved performance on both synthetic and real datasets.
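
To make the notion of a time-varying intensity concrete, the sketch below draws event times from an NHPP using the standard thinning construction. It only illustrates the process itself and is not the SMCMC inference scheme of the paper; the function name, the sinusoidal intensity, and the bound `lam_max` are assumptions chosen for the example.

```python
import numpy as np

def simulate_nhpp(intensity, t_max, lam_max, rng):
    """Draw event times on [0, t_max) by thinning (hypothetical helper).

    Assumes 0 <= intensity(t) <= lam_max for all t in [0, t_max).
    """
    # Candidate events from a homogeneous Poisson process with rate lam_max.
    n = rng.poisson(lam_max * t_max)
    candidates = np.sort(rng.uniform(0.0, t_max, size=n))
    # Keep each candidate with probability intensity(t) / lam_max.
    accept = rng.uniform(0.0, 1.0, size=n) < intensity(candidates) / lam_max
    return candidates[accept]

rng = np.random.default_rng(0)
# Assumed example intensity, bounded above by 10 on the whole interval.
events = simulate_nhpp(lambda t: 5.0 * (1.0 + np.sin(t)), t_max=10.0, lam_max=10.0, rng=rng)
```

Events cluster where the intensity is high and thin out where it is low. Inference has to recover that intensity function from the event times alone.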

icassp-poster.pdf

Sequential Methods for Non-homogeneous Poisson Intensity Inference (619)

Categories: Statistical Signal Processing

31 Views
