ICASSP 2019

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

CLEANING ADVERSARIAL PERTURBATIONS VIA RESIDUAL GENERATIVE NETWORK FOR FACE VERIFICATION

Deep neural networks (DNNs) have recently achieved impressive performance in various applications. However, recent research shows that DNNs are vulnerable to adversarial perturbations injected into input samples. In this paper, we investigate a defense method for face verification: a deep residual generative network (ResGN) is learned to clean adversarial perturbations. We propose a novel training framework composed of the ResGN, a pre-trained VGG-Face network, and a FaceNet network.

poster苏玉莹.pdf

Categories: Biometrics

Channel Estimation and Low-complexity Beamforming Design for Passive Intelligent Surface Assisted MISO Wireless Energy Transfer

The use of a passive intelligent surface (PIS) is emerging as a low-cost green alternative to massive antenna systems for realizing high energy beamforming (EB) gains. To maximize its realistic utility, we present a novel channel estimation (CE) protocol for PIS-assisted energy transfer (PET) from a multiantenna power beacon (PB) to a single-antenna energy harvesting (EH) user. Given the practical limitations of the PIS and the EH user, all computations are carried out at the PB, which has the required active components and radio resources.
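
For intuition on why channel knowledge matters for energy beamforming gain, the following is a minimal sketch of maximum ratio transmission (MRT) over a plain MISO channel. It is not the CE protocol or beamforming design of the paper, and it does not model the PIS; the channel values and function names are made up for illustration.

```python
import math

def beamforming_gain(h, w):
    """Received power gain |sum_i h_i * w_i|^2 for channel h and unit-norm precoder w."""
    return abs(sum(hi * wi for hi, wi in zip(h, w))) ** 2

def mrt_precoder(h):
    """Maximum ratio transmission: conjugate of the channel, scaled to unit norm."""
    norm = math.sqrt(sum(abs(hi) ** 2 for hi in h))
    return [hi.conjugate() / norm for hi in h]

# Hypothetical 3-antenna channel (values invented for this example).
h = [1 + 1j, 1 - 1j, -1 + 0j]
n = len(h)

# Equal weights: what a transmitter can do without any channel estimate.
uniform = [1 / math.sqrt(n)] * n
gain_uniform = beamforming_gain(h, uniform)

# MRT with a perfect channel estimate; the gain equals the squared norm of h.
gain_mrt = beamforming_gain(h, mrt_precoder(h))
```

With a perfect estimate the MRT gain equals the squared channel norm, which is why estimation error directly limits the achievable EB gain.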

Poster_PIS_WET_ICASSP19.pdf

Categories: MIMO Communications and Signal Processing

Sum Throughput Maximization For Multi-Tag MISO Backscattering

Backscatter communication (BSC) is emerging as the core technology for pervasive sustainable internet-of-things applications. However, passive tags are resource-limited, so this work aims to maximize the achievable sum backscattered throughput by jointly optimizing the transceiver (TRX) design at the full-duplex multiantenna reader and the backscattering coefficients (BC) at the single-antenna tags.

Poster_SRM_BSC_ICASSP19.pdf

Categories: Communication Systems and Applications

A Novel Framework Of Hand Localization And Hand Pose Estimation

In this paper, we propose a novel framework for hand localization and pose estimation from a single depth image. For hand localization, unlike most existing methods, which use heuristic strategies such as color segmentation, we propose Hierarchical Hand Location Networks (HHLN) to estimate the hand location from coarse to fine in depth images, which is efficient and robust to complex environments. HHLN is first applied to a low-resolution octree of the whole depth image to produce a coarse hand region, and then the hand region is converted into a high-resolution octree for fine location estimation.

poster_cheyunlong.pdf

Categories: Image/Video Processing

Data-driven simulation using the nuclear norm heuristic

Applications of signal processing and control are classically model-based, involving a two-step procedure for modeling and design: first, a model is built from given data, and second, the estimated model is used for filtering, estimation, or control. Both steps typically involve optimization problems, but their combination is not necessarily optimal, and the modeling step often ignores the ultimate design objective. Recently, data-driven alternatives have been receiving attention; they employ a direct approach that combines modeling and design into a single step.

presICASSP2019.pdf

Categories: Signal and System Modeling, Representation and Estimation

Cycle-consistent adversarial networks for non-parallel vocal effort based speaking style conversion

Speaking style conversion (SSC) is the technology of converting natural speech signals from one style to another. In this study, we propose the use of cycle-consistent adversarial networks (CycleGANs) for converting styles with varying vocal effort, and focus on conversion between normal and Lombard styles as a case study of this problem. We propose a parametric approach that uses the Pulse Model in Log domain (PML) vocoder to extract speech features. These features are mapped using the CycleGAN from utterances in the source style to the corresponding features of target speech.
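
The cycle-consistency idea behind CycleGANs can be sketched in a few lines. This is a toy illustration only, assuming an L1 penalty and hand-written scaling functions in place of trained networks and PML vocoder features; all names and values are hypothetical and are not taken from the paper.

```python
def l1_distance(u, v):
    """Mean absolute difference between two equal-length feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v)) / len(u)

def cycle_consistency_loss(G, F, source_batch, target_batch):
    """L1 cycle loss: x -> G(x) -> F(G(x)) should return x, and likewise for y."""
    forward = sum(l1_distance(F(G(x)), x) for x in source_batch) / len(source_batch)
    backward = sum(l1_distance(G(F(y)), y) for y in target_batch) / len(target_batch)
    return forward + backward

# Toy stand-ins for the two generators (real ones would be neural networks).
def to_lombard(x):
    return [2.0 * v for v in x]

def to_normal(y):
    return [0.5 * v for v in y]

def bad_to_normal(y):
    return [0.4 * v for v in y]  # not an inverse of to_lombard

# Non-parallel toy "utterances": the two sets need not be aligned pairs.
normal_feats = [[1.0, 2.0], [3.0, 4.0]]
lombard_feats = [[2.0, 4.0], [6.0, 8.0]]

perfect = cycle_consistency_loss(to_lombard, to_normal, normal_feats, lombard_feats)
imperfect = cycle_consistency_loss(to_lombard, bad_to_normal, normal_feats, lombard_feats)
```

The loss is zero only when the two mappings invert each other, which is what lets training proceed without parallel utterances in both styles.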

Seshadri_ICASSP2019_final.pdf

Categories: Speech Synthesis and Generation, including TTS (SPE-SYNT)

COMPUTATIONAL COGNITIVE ASSESSMENT: INVESTIGATING THE USE OF AN INTELLIGENT VIRTUAL AGENT FOR THE DETECTION OF EARLY SIGNS OF DEMENTIA

The ageing population has caused a marked increase in the number of people with cognitive decline linked with dementia. Thus, current diagnostic services are overstretched, and there is an urgent need to automate parts of the assessment process. In previous work, we demonstrated how a stratification tool built around an Intelligent Virtual Agent (IVA), which elicits a conversation by asking memory-probing questions, was able to accurately distinguish between people with a neuro-degenerative disorder (ND) and those with a functional memory disorder (FMD).

BM_Icassp2019_Poster.pdf

Categories: Audio and Acoustic Signal Processing

ENCRYPTED SPEECH RECOGNITION USING DEEP POLYNOMIAL NETWORKS

Cloud-based speech recognition APIs give developers and enterprises an easy way to create speech-enabled features in their applications. However, sending audio that contains personal or company-internal information to the cloud raises privacy and security concerns. The recognition results generated in the cloud may also reveal sensitive information. This paper proposes a deep polynomial network (DPN) that can be applied to encrypted speech as an acoustic model. It allows clients to send their data to the cloud in encrypted form so that the data remains confidential, while the DPN can still make frame-level predictions over the encrypted speech and return them in encrypted form. One good property of the DPN is that it can be trained on unencrypted speech features in the traditional way. To keep the raw audio and the recognition results away from the cloud, a cloud-local joint decoding framework is also proposed. We demonstrate the effectiveness of the model and framework on the Switchboard and Cortana voice assistant tasks, with a small performance degradation and latency increase compared with traditional cloud-based DNNs.
https://ieeexplore.ieee.org/document/8683721
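
To illustrate why a polynomial network suits encrypted inputs, here is a minimal sketch. It assumes an encryption scheme that supports only addition and multiplication on ciphertexts (the abstract does not name the scheme), and it uses a hypothetical `Ciphertext` stand-in with no real cryptography. The layer shape, the square activation, and all values are illustrative, not the DPN from the paper.

```python
class Ciphertext:
    """Stand-in for an encrypted value: exposes only + and * (no real crypto)."""

    def __init__(self, value):
        self._value = value  # a real scheme would hold ciphertext, not the value

    def __add__(self, other):
        other_value = other._value if isinstance(other, Ciphertext) else other
        return Ciphertext(self._value + other_value)

    __radd__ = __add__

    def __mul__(self, other):
        other_value = other._value if isinstance(other, Ciphertext) else other
        return Ciphertext(self._value * other_value)

    __rmul__ = __mul__

def decrypt(c):
    """Client-side step: recover the value (trivial here, since nothing is encrypted)."""
    return c._value

def polynomial_layer(inputs, weights, biases):
    """Dense layer followed by a square activation; uses only + and *."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = b + sum(w * x for w, x in zip(w_row, inputs))
        outputs.append(z * z)  # polynomial activation instead of ReLU or sigmoid
    return outputs

# Hypothetical plaintext weights held by the cloud, trained on unencrypted data.
weights = [[0.5, -1.0], [2.0, 0.0]]
biases = [0.5, 1.0]
features = [1.0, 2.0]

# The same layer runs on plain features and on "encrypted" features.
plain_out = polynomial_layer(features, weights, biases)
enc_out = polynomial_layer([Ciphertext(v) for v in features], weights, biases)
decrypted_out = [decrypt(c) for c in enc_out]
```

Because the layer never calls a comparison, exponential, or division on its inputs, the same code path works on both plain and wrapped values, which mirrors the property that the network can be trained on unencrypted features.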

EncryptASR_slides_V2.pdf

Categories: Acoustic Modeling for Automatic Speech Recognition (SPE-RECO); Neural network learning (MLR-NNLR); Signal Processing and Cryptography

Improving Speech Emotion Recognition with Unsupervised Representation Learning on Unlabeled Speech

Improving_SER_with_RL.pdf

Categories: Speech Analysis (SPE-ANLS)

ENTROPY-REGULARIZED OPTIMAL TRANSPORT GENERATIVE MODELS

We investigate the use of entropy-regularized optimal transport (EOT) cost in developing generative models to learn implicit distributions. Two generative models are proposed. One uses the EOT cost directly in a one-shot optimization problem, and the other uses the EOT cost iteratively in an adversarial game. The proposed generative models show improved performance over contemporary models on the scores of sample-based tests.
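
As background on the EOT cost itself, the following is a minimal sketch of Sinkhorn iterations, a standard way to compute an entropy-regularized transport plan between two discrete distributions. It is not either of the proposed generative models; the function name, the regularization value `eps`, and the toy inputs are assumptions made for illustration.

```python
import math

def sinkhorn_cost(a, b, cost, eps=0.1, n_iters=200):
    """Entropy-regularized OT between histograms a and b via Sinkhorn iterations.

    Returns (transport_cost, plan), where transport_cost = sum_ij plan_ij * cost_ij
    (the entropy term itself is not included in the returned value).
    """
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps); smaller eps means less smoothing.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns so the plan matches both marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    transport_cost = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return transport_cost, plan

# Toy example: identical two-point distributions with squared-distance cost.
a = [0.5, 0.5]
b = [0.5, 0.5]
cost = [[0.0, 1.0], [1.0, 0.0]]
value, plan = sinkhorn_cost(a, b, cost)
```

For identical distributions the unregularized cost would be zero; the regularized plan spreads a small amount of mass off the diagonal, so `value` is small but positive.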

poster.pdf

Categories: Learning theory and algorithms (MLR-LEAR)
