Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)

Exploiting Language-Mismatched Phoneme Recognizers for Unsupervised Acoustic Modeling

This paper describes an investigation of acoustic modeling in the absence of transcribed training data. We propose to use language-mismatched phoneme recognizers to assist unsupervised segmentation and segment clustering of a new language. Using a language-mismatched recognizer, an input utterance is divided into many variable-length segments. Each segment is represented by a feature vector that is derived from the phoneme posterior probabilities.
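As a rough illustration of how a variable-length segment might be turned into a fixed-size feature vector from phoneme posteriors, the sketch below averages the frame-level posteriors inside each segment. The averaging rule, the function name `segment_features`, and the toy data are assumptions made for illustration; the abstract does not say how the paper derives its segment features.

```python
from typing import List, Sequence


def segment_features(
    posteriors: Sequence[Sequence[float]],
    boundaries: Sequence[int],
) -> List[List[float]]:
    """Average frame-level phoneme posteriors within each segment.

    posteriors: one posterior vector per frame (hypothetical recognizer output).
    boundaries: strictly increasing frame indices; segment i covers
                frames boundaries[i] .. boundaries[i + 1] - 1.
    """
    features = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        if end <= start:
            raise ValueError("segment boundaries must be strictly increasing")
        frames = posteriors[start:end]
        dim = len(frames[0])
        # Mean posterior per phoneme class over the frames of this segment.
        features.append(
            [sum(frame[k] for frame in frames) / len(frames) for k in range(dim)]
        )
    return features


# Toy example: 5 frames, 3 phoneme classes, two segments (frames 0-1 and 2-4).
posteriors = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.7, 0.2],
    [0.2, 0.6, 0.2],
]
boundaries = [0, 2, 5]
features = segment_features(posteriors, boundaries)
```

Each resulting vector has one entry per phoneme class regardless of segment length, which is what makes the segments comparable for a later clustering step.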

slides.pdf

Categories: Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)

Exploiting LSTM Structure in Deep Neural Networks for Speech Recognition

txh18-he-icassp16-formal-poster.pptx

Categories: Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)

Self-Stabilized Deep Neural Net Poster

selflr_icassp_poster.pdf

Categories: Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)

Compact Kernel Models for Acoustic Modeling via Random Feature Selection

A simple but effective method is proposed for learning compact random feature models that approximate non-linear kernel methods, in the context of acoustic modeling. The method is able to explore a large number of non-linear features while maintaining a compact model via feature selection more efficiently than existing approaches. For certain kernels, this random feature selection may be regarded as a means of non-linear feature selection at the level of the raw input features, which motivates additional methods for computational improvements.
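To make the idea of a random feature model approximating a kernel concrete, the sketch below uses the standard random Fourier feature construction for a Gaussian (RBF) kernel. This is only the generic approximation step; the feature selection procedure described in the abstract is not shown because the abstract does not specify it. The helper names `make_rff` and `rff_map`, the bandwidth `sigma`, and the toy inputs are assumptions for illustration.

```python
import math
import random


def make_rff(dim, num_features, sigma, seed=0):
    """Draw random projections for the kernel exp(-||x - y||^2 / (2 sigma^2))."""
    rng = random.Random(seed)
    # Frequencies come from the kernel's spectral density, N(0, 1 / sigma^2).
    weights = [
        [rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
        for _ in range(num_features)
    ]
    # Random phases, uniform on [0, 2 * pi].
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    return weights, offsets


def rff_map(x, weights, offsets):
    """Map x to random features z(x) such that z(x) . z(y) approximates k(x, y)."""
    scale = math.sqrt(2.0 / len(weights))
    return [
        scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
        for w, b in zip(weights, offsets)
    ]


def rbf_kernel(x, y, sigma):
    """Exact Gaussian kernel value, used here only for comparison."""
    dist_sq = sum((a - c) ** 2 for a, c in zip(x, y))
    return math.exp(-dist_sq / (2.0 * sigma ** 2))


sigma = 1.0
x = [0.2, -0.4, 0.9]
y = [0.1, 0.3, 0.5]
weights, offsets = make_rff(dim=3, num_features=4000, sigma=sigma)
zx = rff_map(x, weights, offsets)
zy = rff_map(y, weights, offsets)
approx = sum(a * c for a, c in zip(zx, zy))  # inner product of random features
exact = rbf_kernel(x, y, sigma)
```

The approximation error shrinks roughly with the square root of the number of random features, which is why such models tend to grow large and why keeping them compact is a concern.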

ICASSP_Poster_Avner_May_v3.pdf

Categories: Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)

Discriminatively Trained Joint Speaker and Environment Representations for Adaptation of Deep Neural Network Acoustic Models

mfy-icassp16-slides.pdf

Categories: Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)

Recurrent SVM for Speech Recognition

RecurrentSVM_poster.pdf

Categories: Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)

On Training the Recurrent Neural Network Encoder-Decoder for Large Vocabulary End-to-end Speech Recognition

Recently, there has been an increasing interest in end-to-end speech recognition using neural networks, with no reliance on hidden Markov models (HMMs) for sequence modelling as in the standard hybrid framework. The recurrent neural network (RNN) encoder-decoder is such a model, performing sequence to sequence mapping without any predefined alignment. This model first transforms the input sequence into a fixed length vector representation, from which the decoder recovers the output sequence. In this paper, we extend

liang_icassp16_slides.pdf

Categories: Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)

Deep convolutional acoustic word embeddings using word-pair side information

Recent studies have been revisiting whole words as the basic modelling unit in speech recognition and query applications, instead of phonetic units. Such whole-word segmental systems rely on a function that maps a variable-length speech segment to a vector in a fixed-dimensional space; the resulting acoustic word embeddings need to allow for accurate discrimination between different word types, directly in the embedding space. We compare several old and new approaches in a word discrimination task.
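A minimal sketch of what "mapping a variable-length speech segment to a fixed-dimensional vector" can look like is given below. It uses uniform frame downsampling followed by a cosine distance, chosen here only as a simple stand-in; it is not the convolutional approach named in the title, and the function names, the choice of `k`, and the toy segments are all assumptions for illustration.

```python
import math


def downsample_embedding(frames, k=2):
    """Embed a segment by concatenating k uniformly spaced frames.

    frames: list of per-frame feature vectors (any length >= 1).
    Returns a vector of length k * len(frames[0]), independent of segment length.
    """
    n = len(frames)
    if n == 0:
        raise ValueError("segment must contain at least one frame")
    # Centre of each of the k equal-width bins, clipped to a valid index.
    indices = [min(n - 1, int((i + 0.5) * n / k)) for i in range(k)]
    embedding = []
    for i in indices:
        embedding.extend(frames[i])
    return embedding


def cosine_distance(u, v):
    """1 - cosine similarity; smaller means the embeddings are more alike."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cannot compare a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


# Toy segments: a and b follow the same pattern at different lengths, c differs.
seg_a = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
seg_b = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3
seg_c = [[0.0, 1.0]] * 3 + [[1.0, 0.0]] * 2

emb_a = downsample_embedding(seg_a)
emb_b = downsample_embedding(seg_b)
emb_c = downsample_embedding(seg_c)
same_word_distance = cosine_distance(emb_a, emb_b)
different_word_distance = cosine_distance(emb_a, emb_c)
```

In a word discrimination task of the kind the abstract mentions, a good embedding would place segments of the same word type closer together than segments of different types, as the two distances in this toy case do.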

kamper+wang+livescu_icassp2016_talk.pdf

Categories: Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)

Two-Stage Noise Aware Training Using Asymmetric Deep Denoising Autoencoder

Since the introduction of deep neural network (DNN)-based acoustic models, the performance of automatic speech recognition has improved greatly. Building on this achievement, various studies on DNN-based techniques for noise robustness are also in progress. Among these approaches, the noise-aware training (NAT) technique, which aims to improve the inherent robustness of DNNs using noise estimates, has shown remarkable performance. However, despite this performance, we cannot be certain whether NAT is an optimal method for fully utilizing the inherent robustness of DNNs.
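One common way to give a DNN access to a noise estimate is to append that estimate to every input frame. The sketch below assumes the noise is estimated as the mean of the first few frames of the utterance, which are taken to be speech-free; that estimator, the function names, and the toy features are assumptions for illustration. The two-stage training and the asymmetric denoising autoencoder named in the title are not shown.

```python
def estimate_noise(frames, num_lead=2):
    """Estimate a noise vector as the mean of the leading frames.

    Assumes (hypothetically) that the first num_lead frames contain no speech.
    """
    lead = frames[:num_lead]
    if not lead:
        raise ValueError("need at least one frame to estimate noise")
    dim = len(lead[0])
    return [sum(frame[k] for frame in lead) / len(lead) for k in range(dim)]


def noise_aware_inputs(frames, num_lead=2):
    """Append the utterance-level noise estimate to each frame's features."""
    noise = estimate_noise(frames, num_lead)
    return [list(frame) + noise for frame in frames]


# Toy utterance: 4 frames of 2-dimensional features.
frames = [
    [0.1, 0.3],
    [0.3, 0.1],
    [2.0, 1.5],
    [1.8, 1.2],
]
noise = estimate_noise(frames)
augmented = noise_aware_inputs(frames)
```

The network input dimension doubles in this sketch, since each frame now carries its own features plus the shared noise estimate.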

ICASSP2016_포스터_이강현_그래프2.pdf

Categories: Robust Speech Recognition (SPE-ROBU); Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)
