
Robust Speech Recognition (SPE-ROBU)

SOUND SOURCE SEPARATION USING PHASE DIFFERENCE AND RELIABLE MASK SELECTION


In this paper, we present an algorithm called Reliable Mask Selection-Phase Difference Channel Weighting (RMS-PDCW), which separates a target source masked by a noise source using angle-of-arrival (AoA) information computed from inter-microphone phase differences. The RMS-PDCW algorithm selects which masks to apply based on the localized sound source and the detection of speech onsets.
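The AoA step described above can be sketched as follows. This is a minimal two-microphone illustration under a far-field, single-source assumption; the function name, the energy weighting, and the default sound speed are our own choices, not details from the paper:

```python
import numpy as np

def estimate_aoa(x_left, x_right, mic_distance, fs, c=340.0):
    """Estimate the angle of arrival (radians, 0 = broadside) from the
    inter-microphone phase difference of two time-aligned channels.
    Illustrative helper, not the paper's full RMS-PDCW pipeline."""
    X_l = np.fft.rfft(x_left)
    X_r = np.fft.rfft(x_right)
    freqs = np.fft.rfftfreq(len(x_left), d=1.0 / fs)
    # Per-bin phase difference; skip the DC bin to avoid dividing by 0 Hz.
    phase_diff = np.angle(X_l * np.conj(X_r))[1:]
    f = freqs[1:]
    # Implied time difference of arrival per bin, then sin(theta) = c*tdoa/d.
    tdoa = phase_diff / (2.0 * np.pi * f)
    sin_theta = np.clip(c * tdoa / mic_distance, -1.0, 1.0)
    # Energy-weight the bins so low-energy (noise-dominated) bins count less.
    weights = np.abs(X_l[1:]) ** 2
    return float(np.arcsin(np.average(sin_theta, weights=weights)))
```

Energy weighting is one of several reasonable choices here; the paper's mask-selection logic additionally uses speech-onset information, which this sketch omits.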

Paper Details

Authors:
Chanwoo Kim, Anjali Menon, Michiel Bacchiani, Richard Stern
Submitted On:
7 May 2018 - 12:38am

Document Files

icassp_4465_poster.pdf (47 downloads)


[1] Chanwoo Kim, Anjali Menon, Michiel Bacchiani, Richard Stern, "SOUND SOURCE SEPARATION USING PHASE DIFFERENCE AND RELIABLE MASK SELECTION", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3203. Accessed: Jun. 22, 2018.

SPECTRAL DISTORTION MODEL FOR TRAINING PHASE-SENSITIVE DEEP-NEURAL NETWORKS FOR FAR-FIELD SPEECH RECOGNITION


In this paper, we present an algorithm that introduces phase perturbation into the training database when training phase-sensitive deep neural network models. Traditional features such as log-mel or cepstral features do not carry any phase-relevant information. However, features such as raw waveforms or complex spectra do contain phase-relevant information. Phase-sensitive features have the advantage of being able to detect differences in time of arrival.
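A minimal sketch of phase perturbation on a complex spectrogram, assuming a uniform jitter model; the function name, the `max_jitter` parameter, and its range are illustrative, and the paper's actual spectral distortion model may differ:

```python
import numpy as np

def perturb_phase(stft, max_jitter=0.2, rng=None):
    """Apply random per-bin phase jitter to a complex STFT, leaving
    magnitudes untouched. max_jitter is the largest phase offset in
    radians (an illustrative choice, not taken from the paper)."""
    rng = np.random.default_rng() if rng is None else rng
    jitter = rng.uniform(-max_jitter, max_jitter, size=stft.shape)
    # Multiplying by a unit-magnitude complex exponential rotates only
    # the phase of each time-frequency bin.
    return stft * np.exp(1j * jitter)
```

Because only the phase moves, magnitude-based features of the same signal are unchanged, which is exactly why this kind of augmentation targets phase-sensitive models.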

Paper Details

Authors:
Chanwoo Kim, Tara Sainath, Arun Narayanan, Ananya Misra, Rajeev Nongpiur, Michiel Bacchiani
Submitted On:
7 May 2018 - 12:19am

Document Files

icassp_4404_poster.pdf (38 downloads)


[1] Chanwoo Kim, Tara Sainath, Arun Narayanan, Ananya Misra, Rajeev Nongpiur, Michiel Bacchiani, "SPECTRAL DISTORTION MODEL FOR TRAINING PHASE-SENSITIVE DEEP-NEURAL NETWORKS FOR FAR-FIELD SPEECH RECOGNITION", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3202. Accessed: Jun. 22, 2018.

Spectral feature mapping with mimic loss for robust speech recognition


For the task of speech enhancement, local learning objectives are agnostic to the phonetic structures that help speech recognition. We propose adding a global criterion to ensure that denoised speech is useful for downstream tasks such as ASR. We first train a spectral classifier on clean speech to predict senone labels. The spectral classifier is then joined with our speech enhancer to form a noisy-speech recognizer, and this model is taught to imitate the output of the spectral classifier alone on clean speech.
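The combined objective described above can be sketched as a local spectral loss plus a teacher-matching ("mimic") term. The function below and its mixing weight `alpha` are illustrative, not the paper's exact formulation:

```python
import numpy as np

def mimic_loss(enhanced, clean, student_logits, teacher_logits, alpha=0.5):
    """Local spectral loss plus a global 'mimic' term pushing the
    recognizer's outputs on enhanced speech toward the frozen teacher's
    outputs on clean speech. alpha is an illustrative mixing weight."""
    local = np.mean((enhanced - clean) ** 2)        # frame-local fidelity
    mimic = np.mean((student_logits - teacher_logits) ** 2)  # global criterion
    return local + alpha * mimic
```

The key design point is that the second term backpropagates phonetic information into the enhancer even though the enhancer itself never sees senone labels directly.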

Paper Details

Authors:
Peter Plantinga, Adam Stiff, Eric Fosler-Lussier
Submitted On:
16 April 2018 - 3:17am

Document Files

icassp-2018-poster_deblin.pdf (34 downloads)


[1] Peter Plantinga, Adam Stiff, Eric Fosler-Lussier, "Spectral feature mapping with mimic loss for robust speech recognition", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/2909. Accessed: Jun. 22, 2018.

EXPLORING THE USE OF GROUP DELAY FOR GENERALISED VTS BASED NOISE COMPENSATION


In earlier work we studied the effect of statistical normalisation for phase-based features and observed that it leads to a significant improvement in robustness. This paper explores extending the generalised Vector Taylor Series (gVTS) noise-compensation approach to the group delay (GD) domain. We discuss the problems this presents, propose solutions, and derive the corresponding formulae. Furthermore, we study the effects of additive and channel noise in the GD domain.
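Group delay itself, the domain this paper works in, is the negative derivative of the phase spectrum, and it can be computed without explicit phase unwrapping via a standard identity using the DFTs of x[n] and n·x[n]. The helper below is that generic textbook computation, not part of the gVTS derivation:

```python
import numpy as np

def group_delay(x):
    """Group delay spectrum tau(w) = -d(arg X(w))/dw, in samples,
    computed as (X_R*Y_R + X_I*Y_I) / |X|^2 with Y = DFT{n * x[n]}."""
    n = np.arange(len(x))
    X = np.fft.rfft(x)
    Y = np.fft.rfft(n * x)
    eps = 1e-12  # guard against spectral zeros
    return (X.real * Y.real + X.imag * Y.imag) / (np.abs(X) ** 2 + eps)
```

For a pure delay of k samples, this returns a flat spectrum equal to k, which is a convenient sanity check for the implementation.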

Paper Details

Authors:
Erfan Loweimi, Jon Barker, Thomas Hain
Submitted On:
17 April 2018 - 6:16pm

Document Files

Presentation.SLIDES (27 downloads)

Presentation.MP3 (25 downloads)


[1] Erfan Loweimi, Jon Barker, Thomas Hain, "EXPLORING THE USE OF GROUP DELAY FOR GENERALISED VTS BASED NOISE COMPENSATION", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/2741. Accessed: Jun. 22, 2018.

Multi-Task Autoencoder For Noise-Robust Speech Recognition


For speech recognition in noisy environments, we propose a multi-task autoencoder that estimates not only the clean speech but also the noise from noisy speech. We introduce the deSpeeching autoencoder, which removes speech signals from noisy speech, and combine it with the conventional denoising autoencoder to form a unified multi-task autoencoder (MTAE). We evaluate it on the Aurora 2 data set and a 6-hour noise data set that we collected ourselves. The MTAE reduces WER by 15.7% relative to the conventional denoising autoencoder on Aurora 2 test set A.
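A rough sketch of the MTAE idea: one shared encoder with a denoising head and a "de-speeching" head, trained with a joint objective. The layer sizes, weights, and mixing weight `beta` below are placeholders, not the paper's architecture:

```python
import numpy as np

def mtae_forward(noisy, W_shared, W_clean, W_noise):
    """Shared encoder layer with two linear heads: one estimates the
    clean speech, the other the noise. Weights stand in for a trained
    network; this is an illustrative shape, not the paper's model."""
    h = np.tanh(noisy @ W_shared)      # shared representation
    return h @ W_clean, h @ W_noise    # (clean_hat, noise_hat)

def mtae_loss(clean_hat, noise_hat, clean, noise, beta=1.0):
    # Denoising term plus weighted de-speeching term.
    return (np.mean((clean_hat - clean) ** 2)
            + beta * np.mean((noise_hat - noise) ** 2))
```

Forcing the shared representation to explain both the speech and the noise is what distinguishes the MTAE from a plain denoising autoencoder.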

Paper Details

Authors:
Haoyi Zhang, Conggui Liu, Nakamasa Inoue, Koichi Shinoda
Submitted On:
12 April 2018 - 8:01pm

Document Files

haoy-icassp18.pdf (53 downloads)


[1] Haoyi Zhang, Conggui Liu, Nakamasa Inoue, Koichi Shinoda, "Multi-Task Autoencoder For Noise-Robust Speech Recognition", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/2523. Accessed: Jun. 22, 2018.

Sequence Modeling in Unsupervised Single-channel Overlapped Speech Recognition


Unsupervised single-channel overlapped speech recognition is one of the hardest problems in automatic speech recognition (ASR). The problem can be modularized into three sub-problems: frame-wise interpreting, sequence-level speaker tracing, and speech recognition. Nevertheless, previous acoustic models formulate the correlation between sequential labels only implicitly, which limits the modeling effect. In this work, we include explicit models for the sequential label correlation during training. This is relevant to models given by both
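One common way to make training of overlapped-speech models independent of an arbitrary speaker ordering is a permutation-free objective over the output streams. The sketch below is a generic illustration of that idea, not necessarily the explicit sequential-label model this paper proposes:

```python
import numpy as np
from itertools import permutations

def permutation_free_loss(estimates, references):
    """Score every assignment of estimated streams to reference
    speakers and keep the best one, so the loss does not depend on an
    arbitrary speaker ordering. Illustrative sketch only."""
    n = len(estimates)
    best = np.inf
    for perm in permutations(range(n)):
        loss = sum(np.mean((estimates[i] - references[p]) ** 2)
                   for i, p in enumerate(perm))
        best = min(best, loss)
    return best / n
```

Exhaustive permutation search is fine for two or three speakers; the factorial cost is why assignments are usually fixed over whole utterances rather than per frame.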

Paper Details

Submitted On:
12 April 2018 - 12:40pm

Document Files

cocktail icassp2018 oral slides_zhc00.pdf (35 downloads)


[1] "Sequence Modeling in Unsupervised Single-channel Overlapped Speech Recognition", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/2440. Accessed: Jun. 22, 2018.

Robust Recognition of Speech with Background Music in Acoustically Under-Resourced Scenarios


This paper addresses the task of Automatic Speech Recognition (ASR) with music in the background. We consider two different situations: 1) scenarios with a very small amount of labeled training utterances (1 hour) and 2) scenarios with a large amount of labeled training utterances (132 hours). In these situations, we aim to achieve robust recognition. To this end we investigate the following techniques: a) multi-condition training of the acoustic model, b) denoising autoencoders for feature enhancement and c)
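Multi-condition training data of the kind mentioned in a) is typically generated by mixing clean speech with background music at controlled SNRs. A minimal sketch follows; the function name and scaling convention are ours:

```python
import numpy as np

def mix_at_snr(speech, music, snr_db):
    """Scale the background music so the speech-to-music power ratio is
    snr_db, then add the two signals."""
    p_speech = np.mean(speech ** 2)
    p_music = np.mean(music ** 2)
    # Gain that makes 10*log10(p_speech / (gain^2 * p_music)) == snr_db.
    gain = np.sqrt(p_speech / (p_music * 10 ** (snr_db / 10.0)))
    return speech + gain * music
```

Sweeping `snr_db` over a range (e.g. from clean down to heavily corrupted) during corpus creation is what makes the acoustic model "multi-condition".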

Paper Details

Authors:
Jiri Malek, Jindrich Zdansky, Petr Cerva
Submitted On:
12 April 2018 - 11:32am

Document Files

ICASSP2018_Paper1052_MalekZdanskyCerva.pdf (31 downloads)


[1] Jiri Malek, Jindrich Zdansky, Petr Cerva, "Robust Recognition of Speech with Background Music in Acoustically Under-Resourced Scenarios", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/2404. Accessed: Jun. 22, 2018.

IMPROVED CEPSTRA MINIMUM-MEAN-SQUARE-ERROR NOISE REDUCTION ALGORITHM FOR ROBUST SPEECH RECOGNITION

Paper Details

Authors:
Yan Huang, Yifan Gong
Submitted On:
7 March 2017 - 12:26am

Document Files

ICMMSE_Final.pptx (159 downloads)



[1] Yan Huang, Yifan Gong, "IMPROVED CEPSTRA MINIMUM-MEAN-SQUARE-ERROR NOISE REDUCTION ALGORITHM FOR ROBUST SPEECH RECOGNITION", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1676. Accessed: Jun. 22, 2018.

Speech Activity Detection in Online Broadcast Transcription Using Deep Neural Networks and Weighted Finite State Transducers


A new approach to online Speech Activity Detection (SAD) is proposed, designed for use in a system that carries out 24/7 transcription of radio/TV broadcasts containing a large amount of non-speech segments. To improve the robustness of detection, we adopt Deep Neural Networks (DNNs) trained on artificially created mixtures of speech and non-speech signals at desired levels of Signal-to-Noise Ratio (SNR). An integral part of our approach is an online decoder based on Weighted Finite State Transducers (WFSTs); this decoder smooths the output of the DNN.
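The paper's decoder is WFST-based; as a simplified stand-in for the same smoothing idea, frame-wise speech/non-speech posteriors can be Viterbi-smoothed with a two-state model that penalises switching. The `stay_prob` value below is illustrative:

```python
import numpy as np

def smooth_sad(posteriors, stay_prob=0.99):
    """Viterbi-smooth frame-wise [non-speech, speech] posteriors with a
    two-state model whose transition penalty discourages rapid state
    switching. A simplified stand-in for a WFST decoder."""
    posteriors = np.asarray(posteriors, dtype=float)
    logp = np.log(np.clip(posteriors, 1e-10, 1.0))
    trans = np.log(np.array([[stay_prob, 1.0 - stay_prob],
                             [1.0 - stay_prob, stay_prob]]))
    T = len(posteriors)
    score = logp[0].copy()
    back = np.zeros((T, 2), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + trans       # cand[prev, cur]
        back[t] = np.argmax(cand, axis=0)   # best predecessor per state
        score = cand[back[t], [0, 1]] + logp[t]
    path = np.zeros(T, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(T - 1, 0, -1):           # backtrace
        path[t - 1] = back[t, path[t]]
    return path
```

With a high `stay_prob`, isolated noisy frames are absorbed into the surrounding segment instead of producing spurious speech/non-speech flips.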

Paper Details

Authors:
Lukas Mateju, Petr Cerva, Jindrich Zdansky, Jiri Malek
Submitted On:
28 February 2017 - 5:04am

Document Files

poster.pdf (370 downloads)


[1] Lukas Mateju, Petr Cerva, Jindrich Zdansky, Jiri Malek, "Speech Activity Detection in Online Broadcast Transcription Using Deep Neural Networks and Weighted Finite State Transducers", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1495. Accessed: Jun. 22, 2018.

ON DNN POSTERIOR PROBABILITY COMBINATION IN MULTI-STREAM SPEECH RECOGNITION FOR REVERBERANT ENVIRONMENTS


A multi-stream framework with deep neural network (DNN) classifiers is applied in this paper to improve automatic speech recognition (ASR) performance in environments with different reverberation characteristics. We propose a room-parameter estimation model to determine the stream weights for DNN posterior probability combination, with the aim of obtaining reliable log-likelihoods for decoding. The model is implemented by training a multi-layer
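Posterior combination across streams can be sketched as a weighted average of per-stream senone posteriors, renormalised per frame. In the sketch below the stream weights are plain inputs, standing in for the output of the paper's room-parameter estimator:

```python
import numpy as np

def combine_posteriors(streams, weights):
    """Weighted linear combination of per-stream senone posteriors,
    renormalised so each frame's posteriors sum to one. `weights`
    stands in for the room-parameter estimator's output."""
    streams = np.asarray(streams, dtype=float)   # (n_streams, frames, senones)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                              # normalise stream weights
    combined = np.tensordot(w, streams, axes=1)  # (frames, senones)
    return combined / combined.sum(axis=-1, keepdims=True)
```

Linear combination is one of several options; log-linear combination (weighting log-posteriors) is an equally common alternative in multi-stream ASR.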

Paper Details

Authors:
Feifei Xiong, Stefan Goetze, Bernd T. Meyer
Submitted On:
28 February 2017 - 2:30am

Document Files

poster_icassp17_xiongetal.pdf (187 downloads)


[1] Feifei Xiong, Stefan Goetze, Bernd T. Meyer, "ON DNN POSTERIOR PROBABILITY COMBINATION IN MULTI-STREAM SPEECH RECOGNITION FOR REVERBERANT ENVIRONMENTS", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1482. Accessed: Jun. 22, 2018.
