ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

Although beamforming optimization problems in full-duplex communication systems can be optimally solved with the semidefinite relaxation (SDR) approach, the computational complexity of SDR increases rapidly with the problem size. To circumvent this issue, in this paper we propose an alternating direction method of multipliers (ADMM) that minimizes the augmented Lagrangian of the dual of the SDR and handles the inequality constraints with slack variables. The proposed ADMM is then applied to optimize the relay beamformer so as to maximize the secrecy rate.
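The slack-variable idea can be illustrated on a toy problem. The sketch below is not the paper's algorithm and does not touch the SDR dual or any beamforming model; it assumes a hypothetical scalar problem, minimize 0.5*(x - c)^2 subject to x <= b, and rewrites the inequality as x + s = b with a slack s >= 0 so that scaled-form ADMM can alternate between closed-form updates. The names `admm_scalar`, `c`, `b`, and `rho` are illustrative only.

```python
def admm_scalar(c, b, rho=1.0, iterations=100):
    """Toy ADMM for: minimize 0.5*(x - c)**2 subject to x <= b.

    The inequality is turned into the equality x + s = b with slack s >= 0.
    u is the scaled dual variable for that equality constraint.
    """
    x, s, u = 0.0, 0.0, 0.0
    for _ in range(iterations):
        # x-update: minimize 0.5*(x - c)^2 + (rho/2)*(x + s - b + u)^2
        x = (c + rho * (b - s - u)) / (1.0 + rho)
        # s-update: project the unconstrained minimizer onto s >= 0
        s = max(0.0, b - x - u)
        # dual update: accumulate the constraint residual
        u += x + s - b
    return x, s


# Constraint active: the unconstrained optimum c = 3 violates x <= 1.
x_active, s_active = admm_scalar(c=3.0, b=1.0)
# Constraint inactive: the unconstrained optimum c = 0 already satisfies x <= 1.
x_free, s_free = admm_scalar(c=0.0, b=1.0)
```

In this toy case the iterates approach min(c, b), with the slack going to zero when the constraint is active and to b - c when it is not.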

Immediate and accurate detection of wildfire is essential in forest monitoring systems
• Wildfire is one of the most harmful hazards in rural areas
• For wildfire detection, visible-range video captured by surveillance cameras is suitable
• Such cameras can be deployed and operated in a cost-effective manner
• The challenge is to provide a robust detection system with a negligible false positive rate
• If the flames are visible, they can be detected by analyzing the motion and color cues of a video
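As a rough illustration of combining motion and color cues, the sketch below marks a pixel as a flame candidate only when it is both fire-colored and changed since the previous frame. This is not the authors' detector: the color rule (red dominant, R > G > B), the thresholds `red_min` and `diff_min`, and all function names are assumptions chosen for the example, and frames are plain nested lists of (R, G, B) tuples.

```python
def is_fire_colored(pixel, red_min=180):
    """Assumed color cue: bright red channel with R > G > B."""
    r, g, b = pixel
    return r >= red_min and r > g > b


def is_moving(prev_pixel, curr_pixel, diff_min=30):
    """Assumed motion cue: summed absolute channel difference between frames."""
    diff = sum(abs(c - p) for p, c in zip(prev_pixel, curr_pixel))
    return diff >= diff_min


def flame_candidates(prev_frame, curr_frame):
    """Return a boolean mask: True where both the color and motion cues fire."""
    return [
        [is_fire_colored(c) and is_moving(p, c) for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


# One-row example: the first pixel turns flame-colored, the second is a
# static orange object (fire-colored but not moving).
prev = [[(20, 20, 20), (200, 120, 40)]]
curr = [[(220, 140, 50), (200, 120, 40)]]
mask = flame_candidates(prev, curr)
```

Requiring both cues is one simple way to suppress static fire-colored objects, which relates to the false-positive concern noted above; a practical system would need considerably more than this.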

Automatic melody generation has been a long-time aspiration for both AI researchers and musicians. However, learning to generate euphonious melodies has turned out to be highly challenging. This paper introduces 1) a new variant of variational autoencoder (VAE), where the model structure is designed in a modularized manner in order to model polyphonic and dynamic music with domain knowledge, and 2) a hierarchical encoding/decoding strategy, which explicitly models the dependency between melodic features.

This paper proposes a Bitwise Gated Recurrent Unit (BGRU) network for the single-channel source separation task. Recurrent Neural Networks (RNNs) require several sets of weights within their cells, which significantly increases the computational cost compared to fully-connected networks. To mitigate this increased computation, we focus on the GRU cells and quantize the feedforward procedure with binarized values and bitwise operations. The BGRU network is trained in two stages.
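To show why binarized values allow bitwise arithmetic, the sketch below computes a dot product between two +1/-1 vectors using XNOR and a bit count instead of multiplications. It is a generic illustration of binarized inner products, not the BGRU cell or its two-stage training; the helper names `binarize`, `pack_bits`, and `bitwise_dot` are hypothetical.

```python
def binarize(values):
    """Map real values to +1/-1 by sign (zero maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a +1/-1 vector into an int: bit i is set where signs[i] == +1."""
    word = 0
    for i, sign in enumerate(signs):
        if sign == 1:
            word |= 1 << i
    return word


def bitwise_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask  # XNOR: bit set where the signs match
    matches = bin(agree).count("1")    # popcount
    # Each match contributes +1 and each mismatch -1.
    return 2 * matches - n


weights = binarize([0.3, -1.2, 0.7, -0.1])
inputs = binarize([-0.5, -0.4, 2.0, 0.9])
fast = bitwise_dot(pack_bits(weights), pack_bits(inputs), len(weights))
reference = sum(w * x for w, x in zip(weights, inputs))
```

Here `fast` and `reference` agree, since the XNOR-and-count form is just a rearrangement of the sum of sign products.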

Imagine a robot is shown new concepts visually together with spoken tags, e.g. "milk", "eggs", "butter". After seeing one paired audiovisual example per class, it is shown a new set of unseen instances of these objects, and asked to pick the "milk". Without receiving any hard labels, could it learn to match the new continuous speech input to the correct visual instance? Although unimodal one-shot learning has been studied, where one labelled example in a single modality is given per class, this example motivates multimodal one-shot learning.

The use of spatial information with multiple microphones can improve far-field automatic speech recognition (ASR) accuracy. However, conventional microphone array techniques degrade speech enhancement performance when there is an array geometry mismatch between design and test conditions. Moreover, such speech enhancement techniques do not always yield ASR accuracy improvement due to the difference between speech enhancement and ASR optimization objectives.

Conventional far-field automatic speech recognition (ASR) systems typically employ microphone array techniques for speech enhancement in order to improve robustness against noise or reverberation. However, such speech enhancement techniques do not always yield ASR accuracy improvement because the optimization criterion for speech enhancement is not directly relevant to the ASR objective. In this work, we develop new acoustic modeling techniques that optimize spatial filtering and long short-term memory (LSTM) layers from multi-channel (MC) input based on an ASR criterion directly.
