SPEAKER DIARIZATION: A PERSPECTIVE ON CHALLENGES AND OPPORTUNITIES FROM THEORY TO PRACTICE
- Citation Author(s):
- Submitted by: Kenneth Church
- Last updated: 4 March 2017 - 7:07pm
- Document Type: Presentation Slides
- Event:
- Presenters: Kenneth Church
- Paper Code: you14100
This paper discusses challenges and opportunities in developing a speaker diarization system that operates on real-world call center telephony data. We contrast a standard data set, akin to those used in NIST evaluations, with the data found in call centers. In exploring these differences, we discovered vulnerabilities and proposed changes to address them.
In moving from theory into practice, we introduce two tasks in which speaker diarization and recognition can be leveraged. First, we show that speaker diarization and recognition systems can be integrated to find the speaker common to multiple calls (the call center agent) and, consequently, that speaker's role. Second, once the role is determined, the corresponding speech recognition output can be analyzed to determine the type of support call.
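As a rough illustration of the first task, the sketch below shows one way a common speaker might be picked out across calls. It is not the method from the paper. It assumes that diarization has already split each call into speakers and that a speaker recognition front end has produced one fixed-length embedding per speaker (the embedding values, the function names `cosine` and `find_common_speaker`, and the scoring rule are all hypothetical). In each call, the speaker whose voice best matches some speaker in every other call is taken to be the agent.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def find_common_speaker(calls):
    """Guess which diarized speaker in each call is the shared (agent) voice.

    calls maps call_id -> {speaker_label: embedding}. Speaker labels are
    arbitrary within each call, as diarization output usually is.
    Hypothetical scoring rule: a speaker's score is the mean, over all
    other calls, of its best cosine match in that call.
    """
    if len(calls) < 2:
        raise ValueError("need at least two calls to find a common speaker")

    agent = {}
    for call_id, speakers in calls.items():
        others = [spk for cid, spk in calls.items() if cid != call_id]

        def score(label):
            emb = speakers[label]
            best = [max(cosine(emb, e) for e in other.values()) for other in others]
            return sum(best) / len(best)

        agent[call_id] = max(speakers, key=score)
    return agent


# Made-up embeddings: the same agent voice (near [1, 0, 0]) appears in
# every call under a different label; each customer is distinct.
EXAMPLE_CALLS = {
    "call_1": {"A": [0.9, 0.1, 0.0], "B": [0.0, 1.0, 0.0]},
    "call_2": {"A": [0.0, 0.0, 1.0], "B": [1.0, 0.2, 0.0]},
    "call_3": {"A": [0.95, 0.05, 0.1], "B": [0.1, 0.6, 0.8]},
}

agent_labels = find_common_speaker(EXAMPLE_CALLS)
```

Once a label is chosen per call, the remaining speaker can be treated as the customer, which is the role assignment that the second task builds on. A production system would likely need calibrated speaker recognition scores and a rejection threshold rather than a plain argmax.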