ICASSP 2021

ICASSP 2021, the IEEE International Conference on Acoustics, Speech and Signal Processing, is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

CoughWatch: Real-World Cough Detection Using Smartwatches

CoughWatch ICASSP 21.pdf (469)

Categories:: Other applications of machine learning (MLR-APPL)
Applications in Data Fusion (MLR-FUSI)

33 Views

VOWEL NON-VOWEL BASED SPECTRAL WARPING AND TIME SCALE MODIFICATION FOR IMPROVEMENT IN CHILDREN’S ASR

Acoustic differences between children’s and adults’ speech degrade automatic speech recognition (ASR) performance when a system is trained on adults’ speech and tested on children’s speech. The key acoustic mismatch factors are formant frequencies, speaking rate, and pitch. In this paper, we propose a linear-prediction-based spectral warping method that uses knowledge of the vowel and non-vowel regions in speech signals to mitigate the differences in formant frequencies between child and adult speakers.

ICASSP_21_paperid-3362_ppts.pdf (343)

Categories:: Audio and Acoustic Signal Processing

21 Views

DON’T LOOK BACK: AN ONLINE BEAT TRACKING METHOD USING RNN AND ENHANCED PARTICLE FILTERING

Online beat tracking (OBT) has always been a challenging task because future data is inaccessible and inference must be made in real time. We propose Don’t Look Back! (DLB), a novel approach optimized for efficiency when performing OBT. DLB feeds the activations of a unidirectional RNN into an enhanced Monte-Carlo localization model to infer beat positions. Most preexisting OBT methods either apply offline approaches to a moving window of past data to predict future beat positions, or must be primed with past data at startup to initialize.

DONT LOOK BACK _Mojtaba Heydari_Zhiyao Duan_ICASSP 2021.pdf (295)

Categories:: Music Signal Processing

25 Views

Statistical Properties of a Modified Welch Method That Uses Sample Percentiles

We present and analyze an alternative, more robust approach to Welch’s overlapped segment averaging (WOSA) spectral estimator. Instead of averaging over multiple periodograms, our method computes sample percentiles to estimate power spectral densities (PSDs). Bias and variance of the proposed estimator are derived for varying sample sizes and arbitrary percentiles. We find excellent agreement between our expressions and data sampled from a white Gaussian noise process.
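The idea of replacing the segment average with a sample percentile can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `percentile_welch`, its parameters, the Hann window, and the 50% overlap are assumptions made for illustration. The bias correction used here is only the asymptotic one for exponentially distributed periodogram values (a percentile `p` of an exponential distribution with mean `m` is `-m * ln(1 - p)`); the finite-sample expressions derived in the paper are not reproduced.

```python
import numpy as np

def percentile_welch(x, nperseg=256, noverlap=128, q=50.0, fs=1.0):
    """Welch-style PSD estimate using a sample percentile across segments.

    Hypothetical helper for illustration. q is a percentile strictly
    between 0 and 100 (q=50 gives the median).
    """
    x = np.asarray(x, dtype=float)
    step = nperseg - noverlap
    window = np.hanning(nperseg)
    scale = fs * np.sum(window ** 2)

    # Split the signal into overlapping segments.
    n_seg = (len(x) - nperseg) // step + 1
    segs = np.stack([x[i * step: i * step + nperseg] for i in range(n_seg)])

    # One windowed periodogram per segment (rows), one-sided in frequency.
    pgrams = np.abs(np.fft.rfft(segs * window, axis=1)) ** 2 / scale
    if nperseg % 2 == 0:
        pgrams[:, 1:-1] *= 2.0  # do not double DC or Nyquist
    else:
        pgrams[:, 1:] *= 2.0    # do not double DC

    # Percentile across segments instead of the usual mean.
    est = np.percentile(pgrams, q, axis=0)

    # Asymptotic bias correction assuming exponentially distributed
    # periodogram values (holds for interior bins of Gaussian noise).
    correction = -np.log(1.0 - q / 100.0)
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return freqs, est / correction
```

For unit-variance white Gaussian noise sampled at `fs=1.0`, the one-sided PSD is flat at 2, so the corrected estimate at interior frequency bins should sit close to that level. The DC and Nyquist bins follow a different distribution and are not corrected properly by this sketch.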

wp_presentation.pdf

presentation (268)

Categories:: Statistical Signal Processing

7 Views

Statistical Properties of a Modified Welch Method That Uses Sample Percentiles

wp_poster.pdf

poster (265)

Categories:: Statistical Signal Processing

7 Views

Predictive Coding For Lossless Dataset Compression

Lossless compression of datasets is a problem of significant theoretical and practical interest. It arises naturally when storing, sending, or archiving large collections of information for scientific research. We can greatly improve the encoding bitrate if we allow the compressed dataset to decompress to a permutation of the original data. We prove the equivalence of dataset compression to compressing a permutation-invariant structure of the data and implement such a scheme via predictive coding.
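To see why giving up the original order helps, consider a toy example with integer records. The sketch below is only an illustration under stated assumptions and is not the scheme from the paper: the helper names `encode_multiset`, `decode_multiset`, and `order_bits` are hypothetical, and plain sorting with delta coding stands in for a learned predictor. The order of n distinct items carries up to log2(n!) bits, which a permutation-invariant code does not need to store.

```python
import math

def encode_multiset(values):
    """Sort the records and store each one as the gap to its predecessor.

    The previous sorted value acts as a (very simple) prediction of the
    next one, so the stored residuals are small non-negative integers.
    """
    ordered = sorted(values)
    if not ordered:
        return []
    return [ordered[0]] + [b - a for a, b in zip(ordered, ordered[1:])]

def decode_multiset(deltas):
    """Invert encode_multiset; returns the records in sorted order."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out

def order_bits(n):
    """Upper bound, log2(n!), on the bits needed to store an ordering."""
    return math.lgamma(n + 1) / math.log(2)
```

Decoding returns the same records as the input, but in sorted order, which is one particular permutation of the original dataset. For a dataset of a million distinct records, `order_bits(10**6)` is roughly 18.5 million bits that never have to be transmitted.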

Dataset Compression ICASSP poster.pdf (320)

Categories:: Image/Video Coding

8 Views

CASCADED ALL-PASS FILTERS WITH RANDOMIZED CENTER FREQUENCIES AND PHASE POLARITY FOR ACOUSTIC AND SPEECH MEASUREMENT AND DATA AUGMENTATION

A new Time-Stretched-Pulse provides a solid foundation of acoustic measurements
Motivation: Measure and record speech data acquisition and presentation conditions
Issues: The target (real-world) systems consist of not only linear time-invariant but also non-linear time-invariant, random, and time-varying responses
Solution: We invented a simultaneous measurement of multiple paths by combining extended TSP signals with binary orthogonal weight sequences

presentationSlidesr.pdf

ICASSP2021 Presentation slides on acoustic measurement method (347)

Categories:: Auditory Modeling and Hearing Aids

26 Views

CASCADED ALL-PASS FILTERS WITH RANDOMIZED CENTER FREQUENCIES AND PHASE POLARITY FOR ACOUSTIC AND SPEECH MEASUREMENT AND DATA AUGMENTATION

kawaharaPostergraffleR.pdf

ICASSP2021 Poster presentation on acoustic measurement method (359)

Categories:: Auditory Modeling and Hearing Aids

10 Views

A BIAS-REDUCING LOSS FUNCTION FOR CT IMAGE DENOISING

There is growing interest in the use of deep neural network (DNN) based image denoising to reduce patients’ X-ray dosage in medical computed tomography (CT). An effective denoiser must remove noise while maintaining the texture and detail. Commonly used mean squared error (MSE) loss functions in DNN training weight errors due to bias and variance equally. However, the error due to bias is often more egregious since it results in loss of image texture and detail. In this paper, we present a novel approach to designing a loss
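The statement that MSE weights bias and variance equally follows from the standard decomposition MSE = bias² + variance. The following NumPy sketch checks that identity numerically for a single pixel; the helper name `mse_decomposition` and the simulated numbers are assumptions for illustration, and the loss function proposed in the paper is not reproduced here.

```python
import numpy as np

def mse_decomposition(estimates, truth):
    """Split the MSE of repeated estimates of one value into bias^2 and variance.

    estimates: denoised values of the same pixel over many noise realizations
    truth:     the noise-free value of that pixel
    """
    estimates = np.asarray(estimates, dtype=float)
    mse = np.mean((estimates - truth) ** 2)
    bias_sq = (np.mean(estimates) - truth) ** 2
    variance = np.var(estimates)  # population variance (ddof=0)
    return mse, bias_sq, variance

# Simulated denoiser output: systematically 5 units too low (bias)
# with residual noise of standard deviation 2 (variance).
rng = np.random.default_rng(0)
outputs = 95.0 + 2.0 * rng.standard_normal(10000)
mse, bias_sq, variance = mse_decomposition(outputs, truth=100.0)
```

With these assumed numbers the squared bias dominates the MSE, yet the MSE itself gives no way to penalize it more heavily than the variance term, since both enter the sum with unit weight.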

ICASSP_poster_v6.pdf (318)

Categories:: Medical imaging
Other applications of machine learning (MLR-APPL)

58 Views

MULTI-GRANULARITY FEATURE INTERACTION AND RELATION REASONING FOR 3D DENSE ALIGNMENT AND FACE RECONSTRUCTION

In this paper, we propose a multi-granularity feature interaction and relation reasoning network (MFIRRN) that can recover a detail-rich 3D face and perform more accurate dense alignment in an unconstrained environment. Traditional 3DMM-based methods directly regress parameters, which leaves the reconstructed 3D face lacking fine-grained details. To address this, we use different branches to capture discriminative features at different granularities, especially local features at medium and fine granularities.

ICASSP-poster.pdf (505)

Categories:: Image, Video, and Multidimensional Signal Processing

20 Views
