ICASSP 2021

ICASSP 2021, the IEEE International Conference on Acoustics, Speech and Signal Processing, is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

Pipeline Safety Early Warning Method for Distributed Signal using Bilinear CNN and LightGBM

Yang_2021_ICASSP_poster.pdf (189)

Categories: Other applications of machine learning (MLR-APPL)

30 Views

Universal Neural Vocoding with Parallel WaveNet

icassp2021_universal_vocoding_with_pw.pdf (215)

Categories: Speech Synthesis and Generation, including TTS (SPE-SYNT)

3 Views

Universal Neural Vocoding with Parallel WaveNet

poster_a0_landscape.pdf (336)

Categories: Speech Synthesis and Generation, including TTS (SPE-SYNT)

12 Views

ON THE ACCURACY LIMIT OF JOINT TIME-DELAY/DOPPLER/ACCELERATION ESTIMATION WITH A BAND-LIMITED SIGNAL

ICASSP_poster.pdf

ICASSP 2021 Poster (410)

Categories: Signal and System Modeling, Representation and Estimation

9 Views

Cooperative Scenarios For Multi-agent Reinforcement Learning In Wireless Edge Caching

Wireless edge caching is an important strategy for meeting the demands of next-generation wireless systems. Recent studies have indicated that, in a network of small base stations (SBSs), joint content placement via reinforcement learning improves cache hit performance, since content requests are correlated across SBSs and files. In this paper, we investigate multi-agent reinforcement learning (MARL) and identify four scenarios for cooperation.

PPT1.pdf

PPT (338)

Categories: Communications and Networking

40 Views

IMAGE STEGANOGRAPHY BASED ON ITERATIVE ADVERSARIAL PERTURBATIONS ONTO A SYNCHRONIZED-DIRECTIONS SUB-IMAGE

Steganography today must withstand both feature-based steganalysis and convolutional neural network (CNN) based steganalysis. In this paper, we present a novel steganographic scheme that combines synchronized modification directions with iterative adversarial perturbations to enhance steganographic performance. First, an existing steganographic function is employed to compute initial costs. Then the secret message bits are embedded following the clustering modification directions profile.

ITE_SYN_Presentation-v1.05.pdf (236)

Categories: Watermarking and Steganography

11 Views

Melon Playlist Dataset: A Public Dataset For Audio-based Playlist Generation And Music Tagging

One of the main limitations in the field of audio signal processing is the lack of large public datasets with audio representations and high-quality annotations, owing to restrictions on copyrighted commercial music. We present the Melon Playlist Dataset, a public dataset of mel-spectrograms for 649,091 tracks and 148,826 associated playlists annotated with 30,652 different tags. All the data is gathered from Melon, a popular Korean streaming service. The dataset is suitable for music information retrieval tasks, in particular auto-tagging and automatic playlist continuation.

icassp2021.pdf

Poster (386)

Categories: Music Signal Processing, Multimedia Signal Processing

38 Views

Compact Graph Architecture for Speech Emotion Recognition

We propose a deep graph approach to address the task of speech emotion recognition. Graphs offer a compact, efficient and scalable way to represent data. Following the theory of graph signal processing, we propose to model a speech signal as a cycle graph or a line graph. Such a graph structure enables us to construct a Graph Convolution Network (GCN)-based architecture that can perform an accurate graph convolution, in contrast to the approximate convolution used in standard GCNs.
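
As a rough illustration of the two graph structures named in the abstract, the sketch below builds adjacency matrices for a cycle graph and a line graph over a sequence of speech frames, then sums each node's neighbor features once. It is not the paper's architecture: the function names, the reading of "line graph" as a simple chain of consecutive frames, and the unweighted neighbor sum are all assumptions made for this example.

```python
def cycle_graph_adjacency(n):
    """Adjacency matrix of an undirected cycle: frame i is linked to i-1 and i+1 (mod n)."""
    if n < 3:
        raise ValueError("a cycle graph needs at least 3 nodes")
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        j = (i + 1) % n
        adj[i][j] = 1
        adj[j][i] = 1
    return adj


def line_graph_adjacency(n):
    """Adjacency matrix of a chain of frames: the cycle without its wrap-around edge.

    Assumption: "line graph" is read here as a path over consecutive frames.
    """
    adj = cycle_graph_adjacency(n)
    adj[0][n - 1] = 0
    adj[n - 1][0] = 0
    return adj


def aggregate_neighbors(adj, features):
    """One unweighted propagation step: each node sums its neighbors' feature vectors."""
    n = len(adj)
    dim = len(features[0])
    return [
        [sum(adj[i][j] * features[j][k] for j in range(n)) for k in range(dim)]
        for i in range(n)
    ]


# Hypothetical one-dimensional features for four consecutive frames.
frames = [[1.0], [2.0], [3.0], [4.0]]
cycle_out = aggregate_neighbors(cycle_graph_adjacency(4), frames)
line_out = aggregate_neighbors(line_graph_adjacency(4), frames)
```

In the cycle graph every frame has exactly two neighbors, so the first and last frames are treated like any other; in the line graph the two end frames have only one neighbor each.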

1623_COMPACT GRAPH ARCHITECTURE FOR SPEECH EMOTION RECOGNITION.pdf (357)

Categories: Content-Based Audio Processing

36 Views

(W)EARABLE MICROPHONE ARRAY AND ULTRASONIC ECHO LOCALIZATION FOR COARSE INDOOR ENVIRONMENT MAPPING

We present a microphone array structure for spherical sound incidence angle tracking that can be attached to headphones or directly integrated into earphones. We show that this microphone array, together with an ultrasonic sound source such as a home assistant speaker in the room, makes it possible to estimate the direction and distance of sound reflections on wall surfaces in the room. With our presented method, we achieved sound incidence angle estimation errors of around 14°

ICASSP_paper1269_Slides.pdf (175)

ICASSP_paper1269_Poster.pdf (171)

Categories: Loudspeaker and Microphone Array Signal Processing

4 Views

Assessment of Bipolar Disorder Using Heterogeneous Data of Smartphone-based Digital Phenotyping

Bipolar Disorder (BD) is one of the most common mental illnesses. Rating scales are one approach to diagnosing and tracking BD patients, but the evaluation process demands considerable manpower and time. To reduce the cost in social and medical resources, this study collects user data through a smartphone app to build a database. The data consist of location data, self-report scales, daily mood, sleeping time and multimedia records, which together form heterogeneous digital phenotyping data.

ICASSP2021_Poster_Evan.pdf (278)

Categories: Multimedia human-machine interface and interaction

7 Views
