IEEE ICASSP 2023 - IEEE International Conference on Acoustics, Speech and Signal Processing is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2023 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.
The fusion of multiple probability densities has important applications in many fields, including multi-sensor signal processing, robotics, and smart environments. In this paper, we demonstrate that deep learning-based methods can be used to fuse multi-object densities. Given a scenario with several sensors with possibly different fields of view, tracking is performed locally at each sensor by a tracker, which produces random finite set multi-object densities.
- Read more about Jazznet: A Dataset of Fundamental Piano Patterns for Music Audio Machine Learning Research
The paper introduces the jazznet Dataset, a dataset of fundamental jazz piano music patterns for developing machine learning (ML) algorithms in music information retrieval (MIR). The dataset contains 162520 labeled piano patterns, including chords, arpeggios, scales, and chord progressions with their inversions, resulting in more than 26k hours of audio and a total size of 95GB.
- Read more about The Secret Source : Incorporating Source Features to Improve Acoustic-To-Articulatory Speech Inversion
In this work, we incorporate acoustically derived source features (aperiodicity, periodicity, and pitch) as additional targets in an acoustic-to-articulatory speech inversion (SI) system. We also propose a temporal-convolution-based SI system, which uses auditory spectrograms as the input speech representation, to learn long-range dependencies and complex interactions between the source and the vocal tract and thereby improve the SI task.
- Read more about In-Band Full-Duplex Solutions in the Paradigm of Integrated Sensing and Communication
The paper discusses arguments in favor of using in-band full-duplex frontends for integrated sensing and communication (ISAC), which is being considered for deployment in future 5G/6G infrastructure. Possible scenarios for practical use of the technology are discussed, with additional focus on the self-interference cancellation issue. A possible system implementation is presented at an abstract level for a cellular communication scenario.
- Read more about Large Dimensional Analysis of LS-SVM Transfer Learning (application on PolSAR)
- Read more about The R3VIVAL Dataset: Repository of room responses and 360 videos of a variable acoustics lab
This paper presents a dataset of spatial room impulse responses (SRIRs) and 360° stereoscopic video captures of a variable acoustics laboratory. A total of 34 source positions are measured with 8 different acoustic panel configurations, resulting in a total of 272 SRIRs. The source positions are arranged in 30° increments at concentric circles of radius 1.5, 2, and 3 m measured with a directional studio monitor, as well as 4 extra positions at the room corners measured with an omnidirectional source.
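The measurement count quoted above follows from pairing every source position with every panel configuration. A minimal sketch of that bookkeeping, assuming one SRIR per (configuration, position) pair; the function name and the integer indexing are hypothetical and are not the dataset's actual file layout:

```python
from itertools import product

NUM_SOURCE_POSITIONS = 34  # figure stated in the abstract
NUM_PANEL_CONFIGS = 8      # figure stated in the abstract


def enumerate_measurements(num_configs=NUM_PANEL_CONFIGS,
                           num_positions=NUM_SOURCE_POSITIONS):
    """Return one (config_index, position_index) pair per SRIR.

    The indices are illustrative placeholders, not identifiers
    used by the dataset itself.
    """
    return list(product(range(num_configs), range(num_positions)))


measurements = enumerate_measurements()
print(len(measurements))  # 34 * 8 = 272
```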
In-car child presence detection (CPD) has gained worldwide attention because of the growing number of child deaths reported each year after children are left unattended in a car. Existing solutions usually require dedicated sensors and are being surpassed by WiFi-based CPD, which can provide broader coverage and reuse the in-car WiFi devices. However, existing WiFi-based CPD solutions are not robust: they may suffer from missed detections, because a young child's breathing is very weak, and from high false alarm rates under unfavorable environmental conditions.
- Read more about Cochlear Decomposition: A Novel Bio-inspired Multiscale Analysis Framework
Signal multiscale decomposition (SMD) is an effective analysis tool for identifying modal information in time-domain signals. So far, various SMD approaches have been proposed, such as the Multiresolution Wavelet Transform (MWT), the Empirical Mode Decomposition (EMD), and the Variational Mode Decomposition (VMD). However, issues such as mode mixing for signals with closely spaced modes have been identified. To confront such problems, we propose here a novel spatial auditory decomposition framework for
- Read more about PAPER - Real-Time Multichannel Speech Separation And Enhancement Using A Beamspace-Domain-Based Lightweight CNN
The problems of speech separation and enhancement concern the extraction of the speech emitted by a target speaker in a scenario where multiple interfering speakers or noise, respectively, are present. Many practical applications, such as home assistants and teleconferencing, require some form of speech separation and enhancement as pre-processing before Automatic Speech Recognition (ASR) systems are applied. In recent years, most techniques have focused on applying deep learning to either time-frequency or time-domain representations of the input audio signals.
- Read more about POSTER - Grad-CAM-Inspired Interpretation of Nearfield Acoustic Holography using Physics-Informed Explainable Neural Network
The interpretation and explanation of the decision-making processes of neural networks are becoming a key factor in the deep learning field. Although several approaches have been presented for classification problems, their application to regression models needs further investigation. In this manuscript, we propose a Grad-CAM-inspired approach for the visual explanation of neural network architectures for regression problems.