An End-to-End Network to Synthesize Intonation using a Generalized Command Response Model - Poster

The generalized command response (GCR) model represents intonation as a
superposition of muscle responses to spike command signals. We have previously
shown that the spikes can be predicted by a two-stage system consisting of a
recurrent neural network and a post-processing procedure, but the responses
themselves were fixed dictionary atoms. We propose an end-to-end
neural architecture that replaces the dictionary atoms with trainable
second-order recurrent elements analogous to recursive filters. We demonstrate
gradient stability under modest conditions, and show that the system can be
trained by imposing temporal sparsity constraints. Subjective listening tests
demonstrate that the system can synthesize intonation with high naturalness,
comparable to state-of-the-art acoustic models, and retains the physiological
plausibility of the GCR model.
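
To make the idea of "second-order recurrent elements analogous to recursive filters" concrete, the following is a minimal sketch in plain Python. It is not the architecture from the poster: the function names, the pole parameterization (`r`, `theta`), and the example spike positions are all assumptions chosen for illustration. It only shows how a sparse spike command passed through a two-pole recursive filter yields a smooth response, and how several such responses can be superposed into a contour.

```python
import math

def second_order_response(commands, r=0.9, theta=0.0, gain=1.0):
    """Filter a command signal with a two-pole recursive (IIR) filter.

    Recurrence: y[t] = gain * x[t] + a1 * y[t-1] + a2 * y[t-2]
    The poles sit at r * exp(+/- j * theta), so a1 = 2 r cos(theta) and
    a2 = -r**2. These parameters are illustrative assumptions; in a
    trainable version they would be learned rather than fixed.
    """
    a1 = 2.0 * r * math.cos(theta)
    a2 = -r * r
    y1 = y2 = 0.0  # previous two outputs
    out = []
    for x in commands:
        y = gain * x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesize_contour(spike_trains, pole_radii):
    """Superpose the responses of several filters, one per command channel."""
    n = len(spike_trains[0])
    contour = [0.0] * n
    for spikes, r in zip(spike_trains, pole_radii):
        for t, v in enumerate(second_order_response(spikes, r=r)):
            contour[t] += v
    return contour

# Two hypothetical sparse command channels: a slow positive component
# and a faster negative one.
n = 50
slow = [0.0] * n
slow[5] = 1.0
fast = [0.0] * n
fast[20] = -0.5
contour = synthesize_contour([slow, fast], pole_radii=[0.9, 0.6])
```

With `theta = 0` the filter has a repeated real pole, so its impulse response rises and then decays, roughly in the manner of a critically damped system. Keeping the pole radius below 1 is the standard condition for the filter's response to decay; whether this matches the stability conditions derived in the poster is not something this sketch establishes.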

An End-to-End Network to Synthesize Intonation using a Generalized Command Response Model Poster.pdf

Presentation Poster (495)
