Spatial and Multichannel Audio

RAW WAVEFORM BASED END-TO-END DEEP CONVOLUTIONAL NETWORK FOR SPATIAL LOCALIZATION OF MULTIPLE ACOUSTIC SOURCES

In this paper, we present an end-to-end deep convolutional neural network that operates on multi-channel raw audio data to localize multiple simultaneously active acoustic sources in space. Previously reported end-to-end deep-learning-based approaches localize a single source well directly from multi-channel raw audio, but they are not easily extended to multiple sources because of the well-known permutation problem.

Sundar_RawMSL.pdf

Raw Waveform based MSL (465)

Categories: Applications of Sensor Array and Multi-channel Signal Processing
Applications in Music and Audio Processing (MLR-MUSI)
Spatial and Multichannel Audio

42 Views

Analytical Method of 2.5D Exterior Sound Field Synthesis by Using Multipole Loudspeaker Array

We propose an analytical method of 2.5-dimensional exterior sound field reproduction by using a multipole loudspeaker array. The method reproduces the sound field modeled by expansion coefficients of spherical harmonics based on multipole superposition. We also present an analytical method for converting the expansion coefficients of spherical harmonics to weighting coefficients for multipole superposition.

WASPAA2019_Poster_Imaizumi.pdf (542)

Categories: Spatial and Multichannel Audio

72 Views

3D localized sound zone generation with a planar omni-directional loudspeaker array

This paper presents a 3D localized sound zone generation method using a planar omni-directional loudspeaker array. In the proposed method, multiple co-centered circular arrays are arranged on the horizontal plane, and an additional loudspeaker is located at the arrays' center. The sound field produced by this center loudspeaker is then cancelled using the multiple circular arrays. Because the residual sound field is contained within a sphere, a localized 3D sound zone can be generated inside a sphere whose maximum radius equals that of the circular arrays.

WASPAA_2019_okamoto.pdf (690)

Categories: Spatial and Multichannel Audio

261 Views

MULTI-GEOMETRY SPATIAL ACOUSTIC MODELING FOR DISTANT SPEECH RECOGNITION

The use of spatial information with multiple microphones can improve far-field automatic speech recognition (ASR) accuracy. However, the speech enhancement performance of conventional microphone array techniques degrades when the array geometry differs between design and test conditions. Moreover, such speech enhancement techniques do not always improve ASR accuracy, because the optimization objectives of speech enhancement and ASR differ.

kumatani_poster_icassp2019b.pdf

poster file (471)

template.pdf

manuscript file (466)

Categories: Spatial and Multichannel Audio
Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)
Robust Speech Recognition (SPE-ROBU)

19 Views

FREQUENCY DOMAIN MULTI-CHANNEL ACOUSTIC MODELING FOR DISTANT SPEECH RECOGNITION

Conventional far-field automatic speech recognition (ASR) systems typically employ microphone array techniques for speech enhancement to improve robustness against noise and reverberation. However, such speech enhancement techniques do not always improve ASR accuracy, because the optimization criterion for speech enhancement is not directly relevant to the ASR objective. In this work, we develop new acoustic modeling techniques that optimize spatial filtering and long short-term memory (LSTM) layers from multi-channel (MC) input directly on an ASR criterion.

kumatani_poster_icassp2019a.pdf

poster file (765)

template.pdf

manuscript file (509)

Categories: Spatial and Multichannel Audio
Robust Speech Recognition (SPE-ROBU)
Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)

25 Views

Horizontal 3D sound field recording and 2.5D synthesis with omni-directional circular arrays

Although 2.5D sound field synthesis with a circular loudspeaker array can be used in a 3D sound field, sound field recording with a circular microphone array assumes a 2D sound field rather than a 3D one. This paper presents a method for horizontal 3D sound field recording and 2.5D synthesis in 3D sound fields that uses multiple co-centered omni-directional circular microphone arrays and a circular loudspeaker array, without vertical derivative measurements.

icassp_2019_okamoto_2.pdf (656)

Categories: Spatial and Multichannel Audio

203 Views

Multipath Enabled Private Audio with Noise

We address the problem of privately communicating audio messages to multiple listeners in a reverberant room using a set of loudspeakers. We propose two methods based on emitting noise. In the first method, the loudspeakers emit noise signals that are appropriately filtered so that after echoing along multiple paths in the room, they sum up and descramble to yield distinct meaningful audio messages only at specific focusing spots, while being incoherent everywhere else.

icassp_poster_v3.pdf (449)

Categories: Spatial and Multichannel Audio

12 Views

Binaural Beamforming Based on Automatic Interferer Selection

Binaural cues are important for sound localization. In addition, spatially separated sound sources are more intelligible than co-located ones. Preserving binaural cues in multi-microphone hearing assistive devices is therefore important for the user's listening experience and safety.
A number of linearly-constrained-minimum-variance (LCMV) based methods

ICASSP_poster.pdf (647)

Categories: Spatial and Multichannel Audio

19 Views

post

電子代轉_T03115216_20190502.pdf (2338)

Categories: Spatial and Multichannel Audio

7 Views

A Subband Energy Modification Method for Elevation Control in Median Plane

Elevation perception is crucial for binaural reproduction. A recent study proposed an elevation control method that modifies the energy of HRTFs in each auditory-scale subband, such as ERB or Mel subbands. However, this subband division is designed around auditory excitation patterns and may not be consistent with elevation localization cues. To address this, this study proposes a novel subband division strategy that emphasizes the physiological information involved in elevation localization, based on a statistical analysis of the HRTF.

Icassp2019_poster_ydd_e1.pdf (409)

Categories: Spatial and Multichannel Audio

15 Views
