ICASSP 2022

ICASSP 2022, the IEEE International Conference on Acoustics, Speech and Signal Processing, is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

DENOISING-GUIDED DEEP REINFORCEMENT LEARNING FOR SOCIAL RECOMMENDATION

Social recommendation (SR) aims to improve recommendation performance by incorporating social information. However, such information is not always reliable: some friends may share the user's preferences for a specific item, while others may be irrelevant to that item because of domain differences. Therefore, modeling all of a user's social relationships without considering the relevance of each friend introduces noise into the social context.

DRL4So.pptx (248)

Categories: Machine Learning for Signal Processing

8 Views

DENOISING-ORIENTED DEEP HIERARCHICAL REINFORCEMENT LEARNING FOR NEXT-BASKET RECOMMENDATION

Next-basket recommendation aims to provide users with a basket of items for their next visit by considering the sequence of their historical baskets. However, since a user's purchase interests vary over time, historical baskets often contain many items that are irrelevant to their next choices. Therefore, it is necessary to denoise the sequence of historical baskets and retain only the truly relevant items to improve recommendation performance.

HRL4Ba.pptx (363)

Categories: Machine Learning for Signal Processing

14 Views

Extracting and Distilling Direction-adaptive Knowledge for Lightweight Object Detection in Remote Sensing Images

Recently, several lightweight convolutional neural network (CNN) models have been proposed for airborne or spaceborne remote sensing object detection (RSOD) tasks. However, these lightweight detectors suffer from performance degradation because of the compromises imposed by the limited computing resources of embedded devices. To narrow this performance gap, a direction-adaptive knowledge extraction and distillation (DKED) method is proposed.

Poster_Paper8521.pdf

Poster (195)

Categories: Image/Video Processing

14 Views

Integration of Pre-trained Networks with Continuous Token Interface For End-to-End Spoken Language Understanding

Most End-to-End (E2E) Spoken Language Understanding (SLU) networks leverage pre-trained Automatic Speech Recognition (ASR) networks but still lack the capability to understand the semantics of utterances, which is crucial for the SLU task. To address this, recent studies use pre-trained Natural Language Understanding (NLU) networks. However, it is not trivial to fully utilize both pre-trained networks; many solutions have been proposed, such as Knowledge Distillation (KD), cross-modal shared embedding, and network integration with an interface.

icassp_slu_seo_kwak_lee_ver3.pdf (196)

Categories: Other

5 Views

Temporal Dynamic Convolutional Neural Network for Text-Independent Speaker Verification and Phonemic Analysis

[oral presentation] TDYCNN.pdf

Oral presentation document (221)

Categories: Speaker Recognition and Characterization (SPE-SPKR)

34 Views

ON THE USE OF GEODESIC TRIANGLES BETWEEN GAUSSIAN DISTRIBUTIONS FOR CLASSIFICATION PROBLEMS

slides.pdf (229)

Categories: Statistical Signal Processing

4 Views

ON THE USE OF GEODESIC TRIANGLES BETWEEN GAUSSIAN DISTRIBUTIONS FOR CLASSIFICATION PROBLEMS

poster.pdf (208)

Categories: Statistical Signal Processing

3 Views

OPENFEAT: Improving Speaker Identification by Open-set Few-shot Embedding Adaptation with Transformer

Household speaker identification with few enrollment utterances is an important yet challenging problem, especially when household members share similar voice characteristics and room acoustics. A common embedding space learned from a large number of speakers is not universally applicable for the optimal identification of every speaker in a household.

OPENFEAT_ICASSP.pdf (204)

Categories: Other

18 Views

On the Prediction of the Frequency Response of a Wooden Plate from its Mechanical Parameters

Inspired by deep learning applications in structural mechanics, we focus on how to train two predictors to model the relation between the vibrational response at a prescribed point of a wooden plate and the plate's material properties. In particular, the eigenfrequencies of the plate are estimated via multilinear regression, whereas their amplitudes are predicted by a feedforward neural network.
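
As a rough illustration of the multilinear-regression step mentioned in this abstract (not the authors' implementation), a least-squares fit from material parameters to eigenfrequencies might look like the sketch below. The data are synthetic, and the function names, the number of material parameters (4) and the number of eigenfrequencies (3) are assumptions made purely for illustration.

```python
import numpy as np

def fit_multilinear(X, Y):
    """Least-squares fit of Y ~ [X, 1] @ W; the bias is the last row of W."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return W

def predict_multilinear(W, X):
    """Apply the fitted linear map (with bias) to new inputs."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    return X1 @ W

# Synthetic stand-in data (hypothetical): 200 plates, 4 material
# parameters per plate, 3 eigenfrequencies per plate.
rng = np.random.default_rng(0)
X = rng.uniform(0.5, 1.5, size=(200, 4))
W_true = rng.normal(size=(5, 3))
Y = np.hstack([X, np.ones((200, 1))]) @ W_true  # noise-free linear targets

W = fit_multilinear(X, Y)
max_err = float(np.max(np.abs(predict_multilinear(W, X) - Y)))
```

Because the synthetic targets are exactly linear in the inputs, the fit recovers the generating weights up to numerical precision; real measurements would leave a residual.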

ICASSP_poster.pdf

Poster (336)

ICASSP_presentation_ Badiane.pdf

Slides (217)

On_the_Prediction_of_the_Frequency_Response_of_a_Wooden_Plate_from_Its_Mechanical_Parameters.pdf

Article (214)

Categories: Other applications of machine learning (MLR-APPL), Other

35 Views

INVESTIGATION OF ROBUSTNESS OF HUBERT FEATURES FROM DIFFERENT LAYERS TO DOMAIN, ACCENT AND LANGUAGE VARIATIONS

In this paper, we investigate the use of a pre-trained HuBERT model to build downstream Automatic Speech Recognition (ASR) models using data that differ in domain, accent and even language. We use the standard ESPnet recipe with HuBERT as the pre-trained model, whose output is fed as input features to a downstream Conformer model built from target-domain data. We compare the performance of the pre-trained HuBERT features with that of a baseline Conformer model built with Mel-filterbank features.

HuBERT presentation.pptx (266)

Categories: General Topics in Speech Recognition (SPE-GASR)

64 Views
