
ICASSP 2021, the IEEE International Conference on Acoustics, Speech and Signal Processing, is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

Wireless edge caching is an important strategy for meeting the demands of next-generation wireless systems. Recent studies indicate that, in a network of small base stations (SBSs), joint content placement learned via reinforcement learning improves cache hit performance, since content requests are correlated across SBSs and files. In this paper, we investigate multi-agent reinforcement learning (MARL) and identify four scenarios for cooperation.


Steganography today must withstand both feature-based steganalysis and convolutional neural network (CNN)-based steganalysis. In this paper, we present a novel steganographic scheme that incorporates synchronizing modification directions and iterative adversarial perturbations to enhance steganographic performance. First, an existing steganographic function is employed to compute initial costs. Then the secret message bits are embedded following a clustering modification directions profile.


One of the main limitations in the field of audio signal processing is the lack of large public datasets with audio representations and high-quality annotations due to restrictions of copyrighted commercial music. We present Melon Playlist Dataset, a public dataset of mel-spectrograms for 649,091 tracks and 148,826 associated playlists annotated by 30,652 different tags. All the data is gathered from Melon, a popular Korean streaming service. The dataset is suitable for music information retrieval tasks, in particular, auto-tagging and automatic playlist continuation.


We propose a deep graph approach to the task of speech emotion recognition. Graphs offer a compact, efficient and scalable way to represent data. Following the theory of graph signal processing, we propose to model a speech signal as a cycle graph or a line graph. Such a graph structure enables us to construct a Graph Convolution Network (GCN)-based architecture that can perform an accurate graph convolution, in contrast to the approximate convolution used in standard GCNs.
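
One way to see why a cycle graph admits an exact rather than approximate graph convolution is a standard fact from graph signal processing: the discrete Fourier vectors are eigenvectors of the cycle graph's adjacency matrix, so its spectrum is known in closed form. The sketch below checks this numerically. It is only an illustration of that general fact, not the architecture from the abstract; the function names and the choice of 8 nodes are assumptions made for the example.

```python
import cmath
import math


def cycle_adjacency(n):
    """Adjacency matrix of an undirected cycle graph on n >= 3 nodes."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        adj[i][(i + 1) % n] = 1  # edge to the next node
        adj[i][(i - 1) % n] = 1  # edge to the previous node
    return adj


def dft_vector(n, k):
    """k-th discrete Fourier vector of length n (unnormalized)."""
    return [cmath.exp(2j * cmath.pi * k * m / n) for m in range(n)]


def is_eigenvector(adj, vec, eigenvalue, tol=1e-9):
    """Check adj @ vec == eigenvalue * vec entry by entry."""
    n = len(vec)
    for i in range(n):
        row_sum = sum(adj[i][j] * vec[j] for j in range(n))
        if abs(row_sum - eigenvalue * vec[i]) > tol:
            return False
    return True


# Hypothetical example size: 8 nodes, e.g. 8 speech frames joined in a ring.
n = 8
adj = cycle_adjacency(n)

# For the cycle graph, the k-th Fourier vector has eigenvalue 2*cos(2*pi*k/n).
all_ok = all(
    is_eigenvector(adj, dft_vector(n, k), 2 * math.cos(2 * math.pi * k / n))
    for k in range(n)
)
```

Because the eigenvectors are the Fourier vectors, a spectral filter on this graph can be applied with the ordinary discrete Fourier transform, with no need to approximate the eigendecomposition.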


We present a microphone array structure for spherical sound incidence angle tracking that can be attached to headphones or directly integrated into earphones. We show that this microphone array, together with an ultrasonic sound source such as a home assistant speaker in the room, makes it possible to estimate the direction and distance of sound reflections on wall surfaces in the room. With our presented method, we achieved sound incidence angle estimation errors of around 14°


Bipolar Disorder (BD) is one of the most common mental illnesses. Rating scales are one approach to diagnosing and tracking BD patients. However, the evaluation process demands considerable manpower and time. To reduce the cost in social and medical resources, this study collects user data through a smartphone app to build a database. The data are heterogeneous digital phenotyping data consisting of location data, self-report scales, daily mood, sleeping time and multimedia records.
