
Leveraging LSTM Models for Overlap Detection in Multi-Party Meetings

Citation Author(s):
Neeraj Sajjan, Shobhana Ganesh, Neeraj Sharma, Sriram Ganapathy, Neville Ryant
Submitted by:
Neeraj Sharma
Last updated:
14 April 2018 - 2:54am
Document Type:
Poster
Document Year:
2018
Event:
Presenters:
Sriram Ganapathy
Paper Code:
4021
 

Detecting overlapping speech segments is of key importance in speech applications that analyze multi-party conversations. The detection problem is challenging because overlapping speech segments are typically captured as short speech utterances in far-field microphone recordings. In this paper, we propose detecting overlap segments with a neural network architecture consisting of long short-term memory (LSTM) models. The architecture learns the presence of overlap in speech by identifying the spectro-temporal structure of overlapping speech segments. To evaluate model performance, we perform experiments on simulated overlapped speech generated from the TIMIT database and on natural multi-talker conversational speech from the Augmented Multi-party Interaction (AMI) meeting corpus. The proposed approach yields improvements over a Gaussian mixture model (GMM) based overlap detection system. Furthermore, as an application of overlap detection, integrating overlap detection into a speaker diarization task is shown to improve the diarization error rate.
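The abstract does not describe how the simulated overlapped speech was generated. As a rough illustration only, the sketch below shows one common way to build such data: add a second utterance to a first one at a chosen sample offset, then derive per-frame overlap labels that a frame-level detector (such as an LSTM) could be trained on. The function name `mix_with_overlap`, the 25 ms / 10 ms framing at 16 kHz, and the majority rule for labeling a frame are all assumptions made for this example, not details from the paper. The sketch also treats every sample of each utterance as speech, so it ignores silences inside the utterances.

```python
import numpy as np


def mix_with_overlap(a, b, offset, frame_len=400, hop=160):
    """Mix utterance b into utterance a starting at sample `offset`.

    Returns the mixture and a 0/1 label per frame, where 1 means that more
    than half of the samples in the frame contain both talkers.
    Hypothetical helper; frame_len=400 and hop=160 assume 16 kHz audio.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    total = max(len(a), offset + len(b))

    # Sum the two signals on a common time axis.
    mix = np.zeros(total, dtype=np.float64)
    mix[:len(a)] += a
    mix[offset:offset + len(b)] += b

    # Sample-level activity masks (assumes each utterance is speech throughout).
    active_a = np.zeros(total, dtype=bool)
    active_a[:len(a)] = True
    active_b = np.zeros(total, dtype=bool)
    active_b[offset:offset + len(b)] = True
    overlap = active_a & active_b

    # Frame-level labels by majority vote over the samples in each frame.
    n_frames = 1 + max(0, (total - frame_len) // hop)
    labels = np.zeros(n_frames, dtype=np.int64)
    for i in range(n_frames):
        start = i * hop
        labels[i] = int(overlap[start:start + frame_len].mean() > 0.5)
    return mix, labels


# Toy usage with constant signals standing in for real utterances.
mix, labels = mix_with_overlap(np.ones(1600), np.ones(1600), offset=800)
```

With real data, `a` and `b` would be waveforms from two different speakers, and the labels would be aligned with whatever spectro-temporal features are fed to the detector.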
