IEEE ICASSP 2023, the IEEE International Conference on Acoustics, Speech and Signal Processing, is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2023 conference will feature presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

This study explores the use of a multi-feature Bayesian auditory sound localisation model to classify non-individual head-related transfer functions (HRTFs). Based on predicted sound localisation performance, these are grouped into ‘good’ and ‘bad’, and the ‘best’/‘worst’ is selected from each category. Firstly, we present a greedy algorithm for automated individual calibration of the model based on the individual sound localisation data.

Plug & Play methods combine proximal algorithms with denoiser priors to solve inverse problems. These methods rely on the computability of the proximal operator of the data fidelity term. In this paper, we propose a Plug & Play framework based on linearized ADMM that allows us to bypass the computation of intractable proximal operators. We demonstrate the convergence of the algorithm and provide results on restoration tasks such as super-resolution and deblurring with non-uniform blur.
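As a rough illustration of the idea in the abstract above, the sketch below runs a textbook linearized ADMM iteration in which the proximal step of the prior is replaced by a denoiser, so that only the forward operator and its adjoint are ever applied. Everything concrete here is an assumption for illustration and is not taken from the paper: the quadratic data term, the 3-tap circular `blur` operator, the step sizes `lam` and `tau`, and the function names. The denoiser is passed in as a plain function; the usage lines plug in the identity so the result can be checked against the known signal.

```python
def blur(x, kernel=(0.2, 0.6, 0.2)):
    """Hypothetical forward operator: circular 3-tap blur.

    The kernel is symmetric, so this operator is its own adjoint,
    and its spectral norm is 1 (the kernel sums to 1).
    """
    n = len(x)
    return [
        kernel[0] * x[i - 1] + kernel[1] * x[i] + kernel[2] * x[(i + 1) % n]
        for i in range(n)
    ]


def pnp_linearized_admm(y, denoiser, forward=blur, adjoint=blur,
                        lam=1.0, tau=1.0, n_iter=300):
    """Linearized ADMM for 0.5 * ||K x - y||^2 + prior(x).

    The prox of the prior is replaced by `denoiser`. The prox of the
    composite data term is never needed: only K and K^T are applied.
    Classical theory asks for tau <= lam / ||K||^2.
    """
    x = [0.0] * len(y)
    z = forward(x)          # auxiliary variable standing in for K x
    u = [0.0] * len(y)      # scaled dual variable
    for _ in range(n_iter):
        kx = forward(x)
        # Gradient-like step on the linearized coupling term.
        grad = adjoint([a - b + c for a, b, c in zip(kx, z, u)])
        x = denoiser([xi - (tau / lam) * gi for xi, gi in zip(x, grad)])
        kx = forward(x)
        # Closed-form prox of v -> 0.5 * ||v - y||^2 with parameter lam.
        z = [(a + c + lam * yi) / (1.0 + lam) for a, c, yi in zip(kx, u, y)]
        # Dual update.
        u = [c + a - zi for c, a, zi in zip(u, kx, z)]
    return x


# Usage: blur a known signal, then restore it with an identity "denoiser".
x_true = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
y = blur(x_true)
x_hat = pnp_linearized_admm(y, denoiser=lambda v: v)
```

With the identity in place of the denoiser, the iteration reduces to solving the least-squares problem, which makes it easy to sanity-check. With a learned denoiser, convergence is not automatic and depends on conditions such as those the paper establishes.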

Despite clear evidence for attentional effects in biological spatial hearing, relatively few machine hearing systems exploit attention in binaural sound localisation. This paper addresses this issue by proposing a novel binaural machine hearing system with temporal attention for robust localisation of sound sources in noisy and reverberant conditions. A convolutional neural network is employed to extract noise-robust localisation features, which are similar to interaural phase difference, directly from phase spectra of the left and right ears for each frame.

Among the approaches used to jointly estimate the directions of arrival (DOAs) of K targets in a multiple-input multiple-output dual-function radar communication system, the method that identifies the K largest local maxima of the modulus of the Fourier transform (FT) of the signal received by the antennas has the advantage of low computational cost. However, the local maxima do not necessarily correspond to the values of interest. To avoid this problem, we present an operating mode that allows each DOA to be estimated separately.
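The baseline peak-picking method mentioned in this abstract can be sketched as follows. This is a minimal illustration only, under assumptions that do not come from the paper: a single noise-free snapshot from a uniform linear array with half-wavelength spacing, a zero-padded DFT on `n_bins` bins, and hypothetical function names. It does not implement the separate-estimation mode the authors propose.

```python
import cmath
import math


def dft_modulus(x, n_bins):
    """Modulus of the zero-padded DFT of x on n_bins frequency bins."""
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * k * n / n_bins)
                for n, v in enumerate(x)))
        for k in range(n_bins)
    ]


def k_largest_local_maxima(spectrum, k):
    """Indices of the k largest local maxima of a circular spectrum."""
    n = len(spectrum)
    peaks = [
        i for i in range(n)
        if spectrum[i] > spectrum[i - 1] and spectrum[i] >= spectrum[(i + 1) % n]
    ]
    peaks.sort(key=lambda i: spectrum[i], reverse=True)
    return sorted(peaks[:k])


def estimate_doas(snapshot, k, n_bins=256, spacing=0.5):
    """Map the k strongest spectral peaks to angles in degrees.

    `spacing` is the element spacing in wavelengths (assumed 0.5).
    """
    spectrum = dft_modulus(snapshot, n_bins)
    doas = []
    for b in k_largest_local_maxima(spectrum, k):
        f = b / n_bins
        if f >= 0.5:
            f -= 1.0  # wrap spatial frequency to [-0.5, 0.5)
        s = max(-1.0, min(1.0, f / spacing))  # clamp before asin
        doas.append(math.degrees(math.asin(s)))
    return sorted(doas)


# Usage: 16 antennas, two equal-power targets at -20 and 30 degrees.
true_doas = (-20.0, 30.0)
snapshot = [
    sum(cmath.exp(2j * math.pi * 0.5 * math.sin(math.radians(t)) * n)
        for t in true_doas)
    for n in range(16)
]
doas = estimate_doas(snapshot, k=2)
```

In this well-separated, equal-power case the two strongest peaks land close to the true angles. When one target is much weaker than another, a sidelobe of the strong target can exceed the weak target's peak, which is one way the largest local maxima can fail to correspond to the values of interest.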

We propose a novel method to optimize the structure of factor graphs for graph-based inference. As an example inference task, we consider symbol detection on linear inter-symbol interference channels. The factor graph framework has the potential to yield low-complexity symbol detectors. However, the sum-product algorithm on cyclic factor graphs is suboptimal and its performance is highly sensitive to the underlying graph. Therefore, we optimize the structure of the underlying factor graphs in an end-to-end manner using machine learning.

This paper describes the design of NNSVS, open-source software for neural network-based singing voice synthesis research. NNSVS is inspired by Sinsy, an open-source pioneer in singing voice synthesis research, and provides many additional features such as multi-stream models, autoregressive fundamental frequency models, and neural vocoders. Furthermore, NNSVS provides extensive documentation and numerous scripts to build complete singing voice synthesis systems.

We consider the problem of detecting small and manoeuvring objects with staring array radars. Coherent processing and long-time integration are key to addressing the undesirably low signal-to-noise/background conditions in this scenario and are complicated by the object manoeuvres. We propose a Bayesian solution that builds upon a Bernoulli state space model equipped with the likelihood of the radar data cubes through the radar ambiguity function. Likelihood evaluation in this model corresponds to coherent long-time integration.
