- Read more about RAW WAVEFORM BASED END-TO-END DEEP CONVOLUTIONAL NETWORK FOR SPATIAL LOCALIZATION OF MULTIPLE ACOUSTIC SOURCES
- Log in to post comments
In this paper, we present an end-to-end deep convolutional neural network operating on multi-channel raw audio data to localize multiple simultaneously active acoustic sources in space. Previously reported end-to-end deep learning based approaches work well in localizing a single source directly from multi-channel raw-audio, but are not easily extendable to localize multiple sources due to the well known permutation problem.
- Categories:
- Read more about Analytical Method of 2.5D Exterior Sound Field Synthesis by Using Multipole Loudspeaker Array
- Log in to post comments
We propose an analytical method of 2.5-dimensional exterior sound field reproduction by using a multipole loudspeaker array. The method reproduces the sound field modeled by expansion coefficients of spherical harmonics based on multipole superposition. We also present an analytical method for converting the expansion coefficients of spherical harmonics to weighting coefficients for multipole superposition.
- Categories:
- Read more about 3D localized sound zone generation with a planar omni-directional loudspeaker array
- Log in to post comments
This paper provides a 3D localized sound zone generation method using a planar omni-directional loudspeaker array. In the proposed method, multiple co-centered circular arrays are arranged on the horizontal plane and an additional loudspeaker is located at the array’s center. The sound field produced by this center loudspeaker is then cancelled using the multiple circular arrays. A localized 3D sound zone can thus be generated inside a sphere with a maximum radius of that of the circular arrays because the residual sound field is contained within the sphere.
- Categories:
- Read more about MULTI-GEOMETRY SPATIAL ACOUSTIC MODELING FOR DISTANT SPEECH RECOGNITION
- Log in to post comments
The use of spatial information with multiple microphones can improve far-field automatic speech recognition (ASR) accuracy. However, conventional microphone array techniques degrade speech enhancement performance when there is an array geometry mismatch between design and test conditions. Moreover, such speech enhancement techniques do not always yield ASR accuracy improvement due to the difference between speech enhancement and ASR optimization objectives.
- Categories:
- Read more about FREQUENCY DOMAIN MULTI-CHANNEL ACOUSTIC MODELING FOR DISTANT SPEECH RECOGNITION
- Log in to post comments
Conventional far-field automatic speech recognition (ASR) systems typically employ microphone array techniques for speech enhancement in order to improve robustness against noise or reverberation. However, such speech enhancement techniques do not always yield ASR accuracy improvement because the optimization criterion for speech enhancement is not directly relevant to the ASR objective. In this work, we develop new acoustic modeling techniques that optimize spatial filtering and long short-term memory (LSTM) layers from multi-channel (MC) input based on an ASR criterion directly.
- Categories:
- Read more about Horizontal 3D sound field recording and 2.5D synthesis with omni-directional circular arrays
- Log in to post comments
Although 2.5D sound field synthesis with a circular loudspeaker array can be used in a 3D sound field, a 2D sound field, instead of a 3D sound field, is assumed for a sound field recording with a circular microphone array. This paper presents a horizontal 3D sound field recording and 2.5D synthesis method used in 3D sound fields with multiple co-centered omni-directional circular microphone arrays and a circular loudspeaker array without vertical derivative measurements.
- Categories:
- Read more about Multipath Enabled Private Audio with Noise
- Log in to post comments
We address the problem of privately communicating audio messages to multiple listeners in a reverberant room using a set of loudspeakers. We propose two methods based on emitting noise. In the first method, the loudspeakers emit noise signals that are appropriately filtered so that after echoing along multiple paths in the room, they sum up and descramble to yield distinct meaningful audio messages only at specific focusing spots, while being incoherent everywhere else.
- Categories:
- Read more about Binaural Beamforming Based on Automatic Interferer Selection
- Log in to post comments
Binaural cues are important for sound localization. In addition, spatially separated sound sources are more intelligible than when they are co-located. Binaural cue preservation in multi-microphone hearing assistive devices is therefore important for the user's listening experience and safety.
A number of linearly-constrained-minimum-variance (LCMV) based methods
- Categories:
- Read more about A Subband Energy Modification Method for Elevation Control in Median Plane
- Log in to post comments
Elevation perception is crucial for binaural reproduction. A recent study proposed an elevation control method by modifying the energy of HRTFs in each auditory scale subband, such as the ERB and Mel subband. However, this subband division is designed based on auditory excitation patterns and may not be consistent with the elevation localization cues. To this end, this study proposes a novel subband division strategy which emphasizes the physiological information involved in elevation localization based on a statistical analysis of the HRTF.
- Categories: