ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.
- Read more about LOW-COMPLEXITY DETECTION AND PERFORMANCE ANALYSIS FOR DECODE-AND-FORWARD RELAY NETWORKS
We re-examine the problem of designing low-complexity detectors and their performance analysis for the 3-node one-way decode-and-forward (DF) relay network, where the destination has the statistical channel state information (CSI) of the source-relay link, and the instantaneous CSI of both the source-destination and relay-destination links. Recently proposed detection schemes, such as the piece-wise linear detector (PLD), achieve near-optimal error performance with linear complexity (with respect to the modulation size).
- Read more about F0 CONTOUR ESTIMATION USING PHONETIC FEATURE IN ELECTROLARYNGEAL SPEECH ENHANCEMENT
Pitch plays a significant role in understanding a tonal language like Mandarin. In this paper, we present a new method that estimates the F0 contour for electrolaryngeal (EL) speech enhancement in Mandarin. Our system explores the use of phonetic features to improve the quality of EL speech. First, we train an acoustic model for EL speech and generate the phoneme posterior probability feature sequence for each input EL speech utterance. Then we employ the phonetic features for F0 contour generation rather than the acoustic features.
- Read more about A CASCADE OF CNN AND LSTM NETWORK WITH 3D ANCHORS FOR MITOTIC CELL DETECTION IN 4D MICROSCOPIC IMAGE
Mitotic event detection is a fundamental step in investigating cell behaviors. The event can be used to analyze various diseases, but most previous mitotic event detection methods focused only on two-dimensional (2D) images with time information. Owing to the complex background (normal cells) and mitotic event orientations, the 2D detection methods yield many false positive and false negative results. To solve this problem, we propose a 2.5-dimensional (2.5D) cascaded end-to-end network combined with 3D anchors for accurate detection of mitotic events in 4D microscopic images.
- Read more about Improving Facial Attractiveness Prediction via Co-Attention Learning
Facial attractiveness prediction has drawn considerable attention from the image processing community.
Despite the substantial progress achieved by existing works, various challenges remain.
One is the lack of accurate representation for facial composition, which is essential for attractiveness evaluation. In this paper, we propose to use pixel-wise labelling masks as the meta information of facial composition, and input them into a network for learning high-level semantic representations.
- Read more about SPEAKER CHARACTERIZATION USING TDNN-LSTM BASED SPEAKER EMBEDDING
In this paper we propose speaker characterization using time delay neural networks and long short-term memory neural networks (TDNN-LSTM) speaker embedding. Three types of front-end feature extraction are investigated to find good features for speaker embedding. Three kinds of data augmentation are used to increase the amount and diversity of the training data. The proposed methods are evaluated with the National Institute of Standards and Technology (NIST) speaker recognition evaluation (SRE) tasks.
- Read more about Speaker-dependent WaveNet-based delay-free ADPCM speech coding
This paper proposes a WaveNet-based delay-free adaptive differential pulse code modulation (ADPCM) speech coding system. The WaveNet generative model, which is a state-of-the-art model for neural-network-based speech waveform synthesis, is used as the adaptive predictor in ADPCM. To further improve speech quality, mel-cepstrum-based noise shaping and postfiltering were integrated with the proposed ADPCM system.
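The closed-loop structure that ADPCM relies on can be illustrated with a minimal sketch. It is not the system from the paper: the WaveNet predictor is replaced by a hypothetical one-tap predictor (`BackwardPredictor`) adapted with a normalized LMS step, the quantizer is a plain uniform one, and noise shaping and postfiltering are left out. The constants `STEP`, `LEVELS`, and `MU` are illustrative assumptions. What the sketch does show is why such a coder can be delay-free: the predictor sees only past reconstructed samples, so the decoder can mirror it sample by sample.

```python
import math

STEP = 0.05    # quantizer step size (illustrative assumption)
LEVELS = 64    # number of quantizer levels, 6 bits per sample (illustrative assumption)
MU = 0.1       # adaptation rate of the stand-in predictor (illustrative assumption)


class BackwardPredictor:
    """Hypothetical stand-in for the learned predictor: a single adaptive tap."""

    def __init__(self):
        self.coeff = 0.0
        self.prev = 0.0  # previous reconstructed sample

    def predict(self):
        return self.coeff * self.prev

    def update(self, quantized_residual, reconstructed):
        # Normalized LMS step that uses only values the decoder also knows.
        self.coeff += MU * quantized_residual * self.prev / (self.prev ** 2 + 1e-6)
        self.coeff = max(-1.0, min(1.0, self.coeff))
        self.prev = reconstructed


def encode(samples):
    """Quantize the prediction residual of each sample to an integer code."""
    predictor, codes = BackwardPredictor(), []
    for x in samples:
        pred = predictor.predict()
        index = round((x - pred) / STEP)
        # Clip the code to the range the quantizer can represent.
        index = max(-(LEVELS // 2), min(LEVELS // 2 - 1, index))
        predictor.update(index * STEP, pred + index * STEP)
        codes.append(index)
    return codes


def decode(codes):
    """Rebuild the waveform by running the same predictor as the encoder."""
    predictor, out = BackwardPredictor(), []
    for index in codes:
        pred = predictor.predict()
        recon = pred + index * STEP
        predictor.update(index * STEP, recon)
        out.append(recon)
    return out
```

Because encoder and decoder update the predictor from the same quantized values, their states stay in step without any side information. As long as no code is clipped, the reconstruction error per sample is bounded by half the quantizer step.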
In this work, we tackle this problem by first proposing CCTV-Fights, a novel and challenging dataset containing 1,000 videos of real fights, with more than 8 hours of annotated CCTV footage. We then propose a pipeline with which we assess the impact of different feature extractors, as well as different classifiers.
- Read more about SEQUENTIAL MATCHING MODEL FOR END-TO-END MULTI-TURN RESPONSE SELECTION
- Read more about ALGEBRAICALLY-INITIALIZED EXPECTATION MAXIMIZATION FOR HEADER-FREE COMMUNICATION