SPEAKER AGNOSTIC FOREGROUND SPEECH DETECTION FROM AUDIO RECORDINGS IN WORKPLACE SETTINGS FROM WEARABLE RECORDERS


Citation Author(s):
Amrutha Nadarajan, Krishna Somandepalli, Shrikanth S. Narayanan
Submitted by:
Amrutha Nadarajan
Last updated:
9 May 2019 - 12:29am
Document Type:
Poster
Document Year:
2019
Event:
Presenters:
Amrutha Nadarajan
Paper Code:
4834
 

Audio-signal acquisition as part of wearable sensing adds an important dimension for applications such as understanding human behaviors. As part of a large study on workplace behaviors, we collected audio data from individual hospital staff using custom wearable recorders. To preserve the privacy of interactions in the hospital, only a limited set of audio features was collected. A first step towards audio processing is to identify the foreground speech of the person wearing the audio badge. This task is challenging because of the multi-party nature of possible ambulatory interactions, the lack of access to speaker information, and varying channel and ambient conditions. In this paper, we present a speaker-agnostic approach to foreground detection. We propose a convolutional neural network model to predict foreground regions using a limited set of audio features. We show that these models generalize across the proxy corpora that we collected in-house to approximate the deployment environment. The proxy corpora contained full audio and were used as a test bed to analyze our models in greater detail. We also evaluated the models in the workplace setting to measure speech activity. Our experimental results show a promising direction for analyzing workplace behaviors with privacy-protected sensing.

Link to paper: https://ieeexplore.ieee.org/document/8683244
