ROBUST BINAURAL SOUND LOCALISATION WITH TEMPORAL ATTENTION
- DOI: 10.60864/rnkw-3722
- Submitted by: Qi Hu
- Last updated: 17 November 2023 - 12:07pm
- Document Type: Poster
- Document Year: 2023
- Presenters: Ning Ma
- Paper Code: 4227
Despite clear evidence for attentional effects in biological spatial hearing, relatively few machine hearing systems exploit attention in binaural sound localisation. This paper addresses the gap by proposing a novel binaural machine hearing system that uses temporal attention for robust localisation of sound sources in noisy and reverberant conditions. A convolutional neural network extracts noise-robust localisation features, similar to the interaural phase difference, directly from the phase spectra of the left and right ear signals for each frame. A temporal attention layer operates on top of these frame-level features by incorporating the outputs of a temporal mask estimation module, which indicate how strongly the target dominates each frame. Fully connected layers then map the combined features to the corresponding source azimuth. The temporal mask estimation module and the sound localisation module are trained jointly in a multi-task learning manner. Our evaluation shows that the proposed system accurately estimates the azimuth of a sound source in a variety of reverberant and noisy conditions.
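The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the attention step is a mask-weighted average over frames, that the fully connected stage is a single linear layer with a softmax over discrete azimuth classes, and that the multi-task objective is a weighted sum of a localisation loss and a mask loss. All function names, array shapes, and the weight `alpha` are hypothetical. The classical interaural phase difference is included only as the hand-crafted feature that the learned features are said to resemble.

```python
import numpy as np

def interaural_phase_difference(phase_left, phase_right):
    """Classical IPD: left-minus-right phase, wrapped to (-pi, pi].

    Both inputs are arrays of phase values in radians with identical shape.
    The paper's CNN learns IPD-like features; this is only the reference feature.
    """
    return np.angle(np.exp(1j * (np.asarray(phase_left) - np.asarray(phase_right))))

def temporal_attention_pool(frame_features, frame_mask):
    """Assumed attention step: mask-weighted average of frame features.

    frame_features: (T, D) array, one feature vector per frame.
    frame_mask:     (T,) array in [0, 1], estimated target dominance per frame.
    Returns a (D,) pooled feature vector.
    """
    frame_features = np.asarray(frame_features, dtype=float)
    frame_mask = np.asarray(frame_mask, dtype=float)
    # Normalise the mask so the weights sum to (approximately) one.
    weights = frame_mask / (np.sum(frame_mask) + 1e-8)
    return weights @ frame_features

def azimuth_posterior(pooled, weight_matrix, bias):
    """Assumed output stage: one linear layer followed by a softmax.

    weight_matrix: (D, K) array, bias: (K,) array, K = number of azimuth classes.
    """
    logits = pooled @ weight_matrix + bias
    logits = logits - np.max(logits)  # subtract the maximum for numerical stability
    exp_logits = np.exp(logits)
    return exp_logits / np.sum(exp_logits)

def multitask_loss(posterior, azimuth_index, mask_est, mask_ref, alpha=0.5):
    """Assumed joint objective: cross-entropy plus alpha times mask MSE."""
    localisation_loss = -np.log(posterior[azimuth_index] + 1e-12)
    mask_loss = np.mean((np.asarray(mask_est) - np.asarray(mask_ref)) ** 2)
    return localisation_loss + alpha * mask_loss

# Toy usage: frames 1 and 3 are noise-dominated, so the mask suppresses them.
features = np.array([[1.0, 0.0, 0.0],
                     [9.0, 9.0, 9.0],
                     [3.0, 0.0, 0.0],
                     [9.0, 9.0, 9.0]])
mask = np.array([1.0, 0.0, 1.0, 0.0])
pooled = temporal_attention_pool(features, mask)
posterior = azimuth_posterior(pooled, np.zeros((3, 5)), np.zeros(5))
loss = multitask_loss(posterior, azimuth_index=2, mask_est=mask, mask_ref=mask)
```

In this toy input the two frames with a mask value of zero contribute nothing to the pooled vector, which is the intended effect of weighting frames by target dominance. A trained system would replace the zero-initialised layer with learned parameters and would estimate the mask rather than take it as given.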