
System Combination with Log-linear Models

Citation Author(s):
Jingzhou Yang, Chao Zhang, Anton Ragni, Mark Gales and Phil Woodland
Submitted by:
Jingzhou Yang
Last updated:
24 March 2016 - 8:04am
Document Type:
Poster
Document Year:
2016
Event:
Presenters:
Jingzhou Yang, Chao Zhang, Anton Ragni, Mark Gales and Phil Woodland
Paper Code:
4037
 

Speech recognition performance can often be improved by combining multiple systems. Joint
decoding, where scores from multiple systems are combined during decoding rather than combining
hypotheses, is one efficient approach to system combination. In standard joint decoding, the frame
log-likelihoods from each system are used as the scores. These scores are then weighted and summed
to yield the final score for a frame. The system combination weights for this process are usually
set empirically. In this paper, a recently proposed scheme for learning these system weights is
investigated on a standard noise-robust speech recognition task, AURORA 4. High-performance tandem
and hybrid systems for this task are described. With state-of-the-art training approaches and
configurations for the bottleneck features of the tandem system, the difference in performance
between the tandem and hybrid systems is significantly smaller than usually observed on this task.
A log-linear model is then used to estimate the system weights between these systems. Trained
system weights yield additional gains over empirically set weights when used for decoding, and
further gains can be obtained when they are used in a lattice rescoring fashion.
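
As a rough illustration of the weighted sum used in standard joint decoding, the sketch below combines per-state frame log-likelihoods from two systems into a single score per state. It is not the authors' implementation: the function name, the weights, and the log-likelihood values are all hypothetical, and the weights are fixed by hand here rather than learned with a log-linear model.

```python
from typing import List, Sequence


def combine_frame_scores(
    log_likes: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Weighted sum of frame log-likelihoods across systems.

    log_likes[s][j] is the log-likelihood that system s assigns to
    state j for a single frame; weights[s] is that system's
    combination weight. Both are illustrative assumptions.
    """
    if len(log_likes) != len(weights):
        raise ValueError("need exactly one weight per system")
    n_states = len(log_likes[0])
    if any(len(row) != n_states for row in log_likes):
        raise ValueError("all systems must score the same set of states")
    return [
        sum(w * row[j] for w, row in zip(weights, log_likes))
        for j in range(n_states)
    ]


# Hypothetical log-likelihoods for one frame over three states.
tandem = [-4.0, -2.0, -6.0]
hybrid = [-2.0, -4.0, -2.0]

# Hand-set weights, standing in for the empirically set case.
combined = combine_frame_scores([tandem, hybrid], [0.25, 0.75])
best_state = max(range(len(combined)), key=combined.__getitem__)
```

In a decoder the combined scores would replace the single-system frame scores during search; learning the weights, as investigated in the paper, would replace the hand-set values above.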
