Multi-channel Signal Processing

JOINTLY LEARNING SELECTION MATRICES FOR TRANSMITTERS, RECEIVERS AND FOURIER COEFFICIENTS IN MULTICHANNEL IMAGING

Strategic subsampling has become a focal point due to its effectiveness in compressing data, particularly in the Full Matrix Capture (FMC) approach in ultrasonic imaging. This paper introduces the Joint Deep Probabilistic Subsampling (J-DPS) method, which aims to learn optimal selection matrices simultaneously for transmitters, receivers, and Fourier coefficients. This task-based algorithm is realized by introducing a specialized measurement model and integrating a customized Complex Learned FISTA (CL-FISTA) network.

ICASSP2024_Presentation_J-DPS.pptx

Categories: Multi-channel Signal Processing
Other applications of machine learning (MLR-APPL)

20 Views

CST-FORMER: TRANSFORMER WITH CHANNEL-SPECTRO-TEMPORAL ATTENTION FOR SOUND EVENT LOCALIZATION AND DETECTION

Sound event localization and detection (SELD) is the task of classifying sound events and localizing their direction of arrival (DoA) using multichannel acoustic signals. Prior studies employ spectral and channel information as the embedding for temporal attention. However, this usage limits the ability of the deep neural network to extract meaningful features from the spectral or spatial domains.

ICASSP_6838_poster_yusunshul_final.pdf

Poster material (250)

Categories: Multi-channel Signal Processing

53 Views

All Neural Kronecker Product Beamforming for Speech Extraction with Large-scale Microphone Arrays

Existing frame-wise neural beamformers for speech extraction achieve promising performance at relatively high signal-to-noise ratios (SNRs) with small microphone arrays, but their performance still degrades in relatively low-SNR environments, e.g., SNR < -5 dB. To address this problem, this paper proposes an all-neural beamformer based on Kronecker product decomposition, denoted NeuKP-BF, for large-scale microphone arrays.
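As a rough illustration of the Kronecker product decomposition named in this abstract, the sketch below factors a beamforming weight vector for M = M1 x M2 microphones into two short vectors. It is not the NeuKP-BF network: the weights w1 and w2 are random placeholders (in a neural beamformer they would presumably be predicted by the model), and the function name and array sizes are assumptions chosen only for the example.

```python
import numpy as np

def kronecker_beamform(x, w1, w2):
    """Compute y = w^H x for w = kron(w1, w2) without forming w.

    x  : complex snapshot of length m1 * m2 (one time-frequency bin)
    w1 : complex weights of length m1 (hypothetical first factor)
    w2 : complex weights of length m2 (hypothetical second factor)
    """
    m1, m2 = w1.shape[0], w2.shape[0]
    # Row-major reshape matches np.kron ordering: x[i * m2 + j] -> X[i, j]
    X = x.reshape(m1, m2)
    return np.conj(w1) @ X @ np.conj(w2)

# Illustrative sizes only (assumption): 8 x 16 = 128 microphones
rng = np.random.default_rng(0)
m1, m2 = 8, 16
w1 = rng.standard_normal(m1) + 1j * rng.standard_normal(m1)
w2 = rng.standard_normal(m2) + 1j * rng.standard_normal(m2)
x = rng.standard_normal(m1 * m2) + 1j * rng.standard_normal(m1 * m2)

y_fast = kronecker_beamform(x, w1, w2)
y_full = np.vdot(np.kron(w1, w2), x)  # reference: full-length weight vector

full_params = m1 * m2        # 128 weights for an unfactored beamformer
factored_params = m1 + m2    # 24 weights for the Kronecker factors
print(np.allclose(y_fast, y_full), full_params, factored_params)
```

The factored form needs M1 + M2 weights instead of M1 x M2, which is one plausible reason such a decomposition is attractive for large arrays.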

Poster_ICASSP2024_All Neural Kronecker Product Beamforming for Speech Extraction with Large-scale Microphone Arrays.pdf

Poster (240)

Categories: Multi-channel Signal Processing

13 Views

SpatialCodec: Neural Spatial Speech Coding

In this work, we address the challenge of encoding speech captured by a microphone array using deep learning techniques, with the aim of preserving and accurately reconstructing the crucial spatial cues embedded in multi-channel recordings. We propose a neural spatial audio coding framework that achieves a high compression ratio, leveraging a single-channel neural sub-band codec and SpatialCodec.

SpatialCodec_Poster.pptx

Categories: Multi-channel Signal Processing
Spatial and Multichannel Audio
Speech Coding (SPE-CODI)

78 Views

Optimized Coded Aperture Design in Compressive Spectral Imaging via Coherence Minimization

The coded aperture snapshot spectral imager (CASSI) system senses spatial and spectral information using a binary coded aperture and a dispersive element, so the quality of the reconstructed hyperspectral images is mainly determined by the structure of the coded aperture. Traditional coded apertures (random, Bernoulli, etc.), which encode hyperspectral images at the focal plane array, suffer from suboptimal reconstruction accuracy. Optimizing the coded aperture design therefore improves the reconstruction quality of the scene.
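The coherence criterion mentioned in the title can be illustrated with the standard mutual coherence of a sensing matrix, i.e., the largest absolute normalized inner product between two distinct columns. The sketch below is a generic computation under stated assumptions: the function name is hypothetical, the random binary matrix is only a stand-in, and it does not reproduce the CASSI forward model or the paper's optimization procedure.

```python
import numpy as np

def mutual_coherence(A):
    """Largest absolute normalized inner product between distinct columns of A."""
    A = np.asarray(A, dtype=float)
    norms = np.linalg.norm(A, axis=0)
    # Guard against all-zero columns (assumption: treat them as unnormalized)
    norms = np.where(norms == 0, 1.0, norms)
    An = A / norms
    G = np.abs(An.T @ An)      # magnitudes of the Gram matrix entries
    np.fill_diagonal(G, 0.0)   # ignore each column's product with itself
    return G.max()

# Stand-in for a random binary (0/1) code; sizes are illustrative only
rng = np.random.default_rng(0)
A_random = rng.integers(0, 2, size=(32, 64))
mu_random = mutual_coherence(A_random)

# Sanity checks: orthonormal columns give 0, identical columns give 1
mu_identity = mutual_coherence(np.eye(4))
mu_duplicate = mutual_coherence(np.array([[1.0, 1.0], [2.0, 2.0]]))
print(mu_random, mu_identity, mu_duplicate)
```

A lower value indicates that the columns are closer to mutually orthogonal, which is generally associated with better sparse recovery; a design procedure would search for a code that reduces this quantity.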

Presentation_Optimized Coded Apertures.pptx

Categories: Sampling and Reconstruction
Image/Video Coding
Multi-channel Signal Processing

114 Views

Identification of Overlapping Echoes of Unknown Shape from Time-Encoding Machine Samples

We present an algorithm for the resolution of delayed and overlapping pulses of a common unknown shape from multi-channel measurements. We show that just a few Fourier samples acquired by a Time Encoding Machine (TEM) suffice to solve this challenging problem. This acquisition scheme is desirable for ultra-low-power applications in wearables, such as EMG skin sensor tattoos.