ICASSP 2021

ICASSP 2021, the IEEE International Conference on Acoustics, Speech and Signal Processing, is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

Ensemble combination between different time segmentations

Hypothesis-level combination between multiple models can often yield gains in speech recognition. However, all models in the ensemble are usually restricted to use the same audio segmentation times. This paper proposes to generalise hypothesis-level combination, allowing the use of different audio segmentation times between the models, by splitting and re-joining the hypothesised N-best lists in time. A hypothesis tree method is also proposed to distribute hypothesis posteriors among the constituent words, to facilitate such splitting when per-word scores are not available.
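As a point of reference for the abstract above, the following is a minimal sketch of plain hypothesis-level combination when all models share the same segmentation. It linearly interpolates hypothesis posteriors across N-best lists and picks the highest-scoring hypothesis. It does not implement the paper's splitting and re-joining of N-best lists or its hypothesis tree; the function name `combine_nbest`, the weights, and the toy posteriors are all assumptions made for illustration.

```python
from collections import defaultdict


def combine_nbest(nbest_lists, weights=None):
    """Interpolate hypothesis posteriors across models (hypothetical helper).

    Each N-best list is a dict mapping a hypothesis string to its posterior.
    All lists are assumed to cover the same audio segment.
    """
    if weights is None:
        # Assumption: equal weight for every model in the ensemble.
        weights = [1.0 / len(nbest_lists)] * len(nbest_lists)

    combined = defaultdict(float)
    for weight, nbest in zip(weights, nbest_lists):
        total = sum(nbest.values())  # renormalise over the truncated list
        for hypothesis, posterior in nbest.items():
            combined[hypothesis] += weight * posterior / total

    best = max(combined, key=combined.get)
    return best, dict(combined)


# Toy N-best lists from two hypothetical models for one segment.
model_a = {"the cat sat": 0.6, "the cat sad": 0.4}
model_b = {"the cat sat": 0.3, "a cat sat": 0.7}

best, scores = combine_nbest([model_a, model_b])
print(best)
```

A hypothesis missing from one model's list simply contributes zero from that model, which is why a hypothesis that is moderately likely under both models can beat one that is the top choice of only a single model.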

ICASSP_2021___Multi_pass_combination___poster.pdf

Categories:: General Topics in Speech Recognition (SPE-GASR)

7 Views

SEMI-SUPERVISED SKIN LESION SEGMENTATION WITH LEARNING MODEL CONFIDENCE

poster-xzq.pdf

Categories:: Medical image analysis

3 Views

Independent Vector Analysis using Semi-Parametric Density Estimation via Multivariate Entropy Maximization

Due to the wide use of multi-sensor technology, analysis of multiple sets of data is at the heart of many challenging engineering problems. Independent vector analysis (IVA), a recent generalization of independent component analysis (ICA), enables the joint analysis of datasets and extraction of latent sources through the use of a simple yet effective generative model. However, the success of IVA is tied to proper estimation of the probability density function (PDF) of the multivariate latent sources, information that is generally unknown.

Damasceno_Slides.pdf

Damasceno_Poster.pdf

Categories:: Independent component analysis (MLR-ICAN)

96 Views

CAMP: A Two-Stage Approach To Modelling Prosody In Context

Prosody is an integral part of communication, but remains an open problem in state-of-the-art speech synthesis. There are two major issues faced when modelling prosody: (1) prosody varies at a slower rate compared with other content in the acoustic signal (e.g. segmental information and background noise); (2) determining appropriate prosody without sufficient context is an ill-posed problem. In this paper, we propose solutions to both these issues. To mitigate the challenge of modelling a slow-varying signal, we learn to disentangle prosodic information using a word level representation.

poster.pdf

slides.pdf

Categories:: Speech Synthesis and Generation, including TTS (SPE-SYNT)

13 Views

SIG2SIG : SIGNAL TRANSLATION NETWORKS TO TAKE THE REMAINS OF THE PAST

ICASSP2021_sig2sig_poster.pdf

Categories:: Other applications of machine learning (MLR-APPL)
Machine Learning for Signal Processing

17 Views

Radar Clutter Classification Using Expectation-Maximization Method

poster.pdf

Categories:: Multi-channel Signal Processing

14 Views

Amplitude Matching: Majorization-Minimization Algorithm For Sound Field Control Only With Amplitude Constraint

ICASSP2021.pdf

Categories:: Loudspeaker and Microphone Array Signal Processing

84 Views

DEEPF0: END-TO-END FUNDAMENTAL FREQUENCY ESTIMATION FOR MUSIC AND SPEECH SIGNALS

We propose a novel pitch estimation technique called DeepF0, which leverages the available annotated data to learn directly from the raw audio in a data-driven manner. F0 estimation is important in various speech processing and music information retrieval applications. Existing deep learning models for pitch estimation have relatively limited learning capabilities due to their shallow receptive field. The proposed model addresses this issue by extending the receptive field of the network through the introduction of dilated convolutional blocks.
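To make the receptive-field argument concrete, the short sketch below compares a stack of ordinary 1-D convolutions with a stack of dilated ones. It uses the standard formula for stride-1 layers, where each layer adds (kernel_size - 1) × dilation samples to the receptive field. The kernel size and dilation schedule are illustrative assumptions, not the DeepF0 configuration.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of stacked stride-1 1-D convolutions.

    Each layer widens the field by (kernel_size - 1) * dilation.
    """
    field = 1  # a single output sample sees itself before any layer
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


# Four layers with kernel size 3 (assumed values, for illustration only).
plain = receptive_field(3, [1, 1, 1, 1])    # no dilation
dilated = receptive_field(3, [1, 2, 4, 8])  # dilation doubles per layer

print(plain, dilated)  # prints "9 31"
```

With the same number of layers and parameters, the doubling schedule makes the receptive field grow exponentially with depth rather than linearly.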

ICASSP_2021_Poster-final-version.pdf

End-to-end Pitch estimation using Deep Learning-v2-without-audio.pdf

Categories:: Music Signal Processing
Audio Processing Systems

124 Views

Differential Convolution Feature Guided Deep Multi-Scale Multiple Instance Learning for Aerial Scene Classification

Aerial image classification is challenging for current deep learning models due to the varied geo-spatial object scales and the complicated scene spatial arrangement. Thus, it is necessary to stress the key local feature response from a variety of scales so as to represent discriminative convolutional features. In this paper, we propose a deep multi-scale multiple instance learning (DMSMIL) framework to tackle the above challenges. Firstly, we develop a differential multi-scale dilated convolution feature extractor to exploit the different patterns from different scales.

poster.pdf

Categories:: Pattern recognition and classification (MLR-PATT)

19 Views

Categories:: Other

10 Views

Deep Neural Network based Cough Detection using Bed-mounted Accelerometer Measurements

ICASSP 2021_poster.pdf

ICASSP 2021_presentation.pdf
