ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

We re-examine the problem of designing low-complexity detectors, and analyzing their performance, for the 3-node one-way decode-and-forward (DF) relay network, where the destination has the statistical channel state information (CSI) of the source-relay link and the instantaneous CSI of both the source-destination and relay-destination links. Recently proposed detection schemes, such as the piece-wise linear detector (PLD), achieve near-optimal error performance with linear complexity (with respect to the modulation size).

Pitch plays a significant role in understanding a tone-based language like Mandarin. In this paper, we present a new method that estimates the F0 contour for electrolaryngeal (EL) speech enhancement in Mandarin. Our system explores the use of phonetic features to improve the quality of EL speech. First, we train an acoustic model for EL speech and generate the phoneme posterior probability feature sequence for each input EL speech utterance. Then we use the phonetic features, rather than the acoustic features, for F0 contour generation.

Mitotic event detection is a fundamental step in investigating cell behaviors. The event can be used to analyze various diseases, but most previous mitotic event detection methods focused only on two-dimensional (2D) images with time information. Owing to the complex background (normal cells) and mitotic event orientations, the 2D detection methods yield many false positive and false negative results. To solve this problem, we propose a 2.5-dimensional (2.5D) cascaded end-to-end network combined with 3D anchors for accurate detection of mitotic events in 4D microscopic images.

Facial attractiveness prediction has drawn considerable attention from the image processing community. Despite the substantial progress achieved by existing works, various challenges remain. One is the lack of an accurate representation of facial composition, which is essential for attractiveness evaluation. In this paper, we propose to use pixel-wise labelling masks as the meta information of facial composition, and input them into a network for learning high-level semantic representations.

In this paper, we propose speaker characterization using time delay neural networks and long short-term memory neural networks (TDNN-LSTM) speaker embedding. Three types of front-end feature extraction are investigated to find good features for speaker embedding. Three kinds of data augmentation are used to increase the amount and diversity of the training data. The proposed methods are evaluated with the National Institute of Standards and Technology (NIST) speaker recognition evaluation (SRE) tasks.

This paper proposes a WaveNet-based delay-free adaptive differential pulse code modulation (ADPCM) speech coding system. The WaveNet generative model, which is a state-of-the-art model for neural-network-based speech waveform synthesis, is used as the adaptive predictor in ADPCM. To further improve speech quality, mel-cepstrum-based noise shaping and postfiltering are integrated with the proposed ADPCM system.

In this work, we tackle this problem by first proposing CCTV-Fights, a novel and challenging dataset containing 1,000 videos of real fights, with more than 8 hours of annotated CCTV footage. Then we propose a pipeline on which we assess the impact of different feature extractors, as well as different classifiers.
