Exploiting Vocal Tract Coordination Using Dilated CNNs for Depression Detection in Naturalistic Environments
- Submitted by: Zhaocheng Huang
- Last updated: 28 May 2020 - 10:57pm
- Document Type: Presentation Slides
- Document Year: 2020
- Presenters: Zhaocheng Huang
- Paper Code: SPE-L17.4
Depression detection from speech continues to attract significant research attention but remains a major challenge, particularly when the speech is acquired from diverse smartphones in natural environments. Analysis methods based on vocal tract coordination (VTC) have shown great promise in depression and cognitive impairment detection. These methods quantify relationships between features over time through eigenvalues of multi-scale cross-correlations. Motivated by their success, this paper proposes a novel way to extract full vocal tract coordination (FVTC) features using convolutional neural networks (CNNs), overcoming earlier shortcomings. Evaluations of the proposed FVTC-CNN structure on depressed speech data show improvements in mean F1 scores of at least 16.4% under clean conditions and comparable results under noisy conditions relative to existing VTC baseline systems.
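To make the phrase "eigenvalues of multi-scale cross-correlations" concrete, the following is a minimal sketch of a conventional eigenvalue-based VTC feature, not the FVTC-CNN proposed in the paper. The function name `vtc_eigen_features`, the number of delays, and the delay scales are illustrative assumptions rather than values taken from the slides. Each feature channel is stacked with time-delayed copies of itself, a channel-delay correlation matrix is formed, and its eigenvalue spectrum is kept as the feature vector for that scale.

```python
import numpy as np


def vtc_eigen_features(feats, delays_per_scale=15, scales=(1, 3, 7, 15)):
    """Sketch of eigenvalue-based coordination features (hypothetical helper).

    feats: array of shape (n_channels, n_frames), e.g. formant or MFCC tracks.
    delays_per_scale and scales are assumed values chosen for illustration.
    Returns a dict mapping each delay scale to its eigenvalues, largest first.
    """
    feats = np.asarray(feats, dtype=float)
    n_channels, n_frames = feats.shape
    features = {}
    for scale in scales:
        # Largest shift applied at this scale; the remaining frames are usable.
        max_shift = scale * (delays_per_scale - 1)
        usable = n_frames - max_shift
        if usable < 2:
            raise ValueError("not enough frames for the requested delays")
        # Time-delay embedding: one row per (channel, delay) pair.
        rows = [
            feats[c, d * scale : d * scale + usable]
            for c in range(n_channels)
            for d in range(delays_per_scale)
        ]
        embedded = np.stack(rows)
        # Channel-delay correlation matrix (rows are treated as variables).
        corr = np.corrcoef(embedded)
        # The matrix is symmetric, so eigvalsh applies; reverse to descending.
        features[scale] = np.linalg.eigvalsh(corr)[::-1]
    return features


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.standard_normal((3, 400))  # synthetic stand-in for real features
    for scale, eig in vtc_eigen_features(demo, delays_per_scale=5, scales=(1, 3)).items():
        print(scale, eig.shape)
```

In a sketch like this, a spectrum dominated by a few large eigenvalues indicates strongly coupled channels, while a flatter spectrum indicates more independent movement. Summarizing the matrix by its eigenvalues discards its detailed structure, and a CNN that operates on the full correlation matrix is one way to retain it.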