Automatic identification of speakers from head gestures in a narration

Citation Author(s):
Sanjeev Kadagathur Vadiraj, Achuth Rao M V, Prasanta Kumar Ghosh
Last updated:
16 May 2020 - 10:52am
Document Type:
Presentation Slides
Presenter's Name:
Sanjeev Kadagathur Vadiraj



In this work, we focus on quantifying the speaker identity information encoded in the head gestures of speakers while they narrate a story. We hypothesize that head gestures over a long duration carry speaker-specific patterns. To establish this, we consider a classification problem of identifying speakers from their head gestures. We represent every head orientation as a triplet of Euler angles and a sequence of head orientations as a head gesture. We use a database of recordings from 24 speakers, in which head movements are recorded with a motion capture device and each subject narrates ten stories. We obtain the best speaker identification accuracy of 0.836 using head gestures over a duration of 40 seconds. Further, the accuracy increases by combining decisions from multiple 40-second windows when a recording longer than the window length is available. We achieve an average accuracy of 0.9875 on our database when the entire recording is used. Analysis of the speaker identification performance over 40-second windows across a recording reveals that speaker-identity information is more prevalent in some parts of a story than in others.
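A minimal sketch of the windowing and decision-combination scheme described above. The frame rate, the classifier, and all function names here are assumptions for illustration; the abstract does not specify the model or the feature pipeline:

```python
import numpy as np

FPS = 100            # assumed motion-capture frame rate (not stated in the abstract)
WINDOW_SEC = 40      # window length reported in the abstract
WINDOW = FPS * WINDOW_SEC

def split_windows(gestures):
    """Split a (T, 3) sequence of Euler-angle triplets (one head
    orientation per frame) into non-overlapping 40-second windows."""
    n = gestures.shape[0] // WINDOW
    return [gestures[i * WINDOW:(i + 1) * WINDOW] for i in range(n)]

def combine_decisions(per_window_labels):
    """Combine per-window speaker decisions over a full recording
    by majority vote (one plausible combination rule)."""
    labels, counts = np.unique(per_window_labels, return_counts=True)
    return int(labels[np.argmax(counts)])

# Hypothetical stand-in for the paper's per-window speaker classifier.
def toy_classifier(window):
    return int(np.mean(window[:, 0]) > 0)

# Example: a synthetic 90-second recording yields two full 40 s windows,
# whose decisions are then fused into one speaker label.
recording = np.random.default_rng(0).normal(size=(90 * FPS, 3))
decisions = [toy_classifier(w) for w in split_windows(recording)]
speaker = combine_decisions(decisions)
```

The majority vote is only one way to fuse window-level decisions; averaging per-window class posteriors before taking the argmax is a common alternative.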

