
ICASSP 2019

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

Joint Separation and Dereverberation of Reverberant Mixture with Multichannel Variational Autoencoder

Paper Details

Authors:
Hirokazu Kameoka, Li Li, Shogo Seki, Shoji Makino
Submitted On:
14 May 2019 - 5:42pm

Document Files

AASP_L4_2.pdf


[1] Hirokazu Kameoka, Li Li, Shogo Seki, Shoji Makino, "Joint Separation and Dereverberation of Reverberant Mixture with Multichannel Variational Autoencoder", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4514. Accessed: Jul. 19, 2019.

Learning Discriminative Features in Sequence Training without Requiring Framewise Labelled Data

Paper Details

Authors:
Jun Wang, Dan Su, Jie Chen, Shulin Feng, Dongpeng Ma, Na Li, Dong Yu
Submitted On:
15 May 2019 - 3:11am

Document Files

Slides.pdf


[1] Jun Wang, Dan Su, Jie Chen, Shulin Feng, Dongpeng Ma, Na Li, Dong Yu, "Learning Discriminative Features in Sequence Training without Requiring Framewise Labelled Data", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4513. Accessed: Jul. 19, 2019.

INDUCTIVE CONFORMAL PREDICTOR FOR SPARSE CODING CLASSIFIERS: APPLICATIONS TO IMAGE CLASSIFICATION


Conformal prediction uses the degree of strangeness (nonconformity) of new data instances to determine the confidence values of new predictions. We propose an inductive conformal predictor for sparse coding classifiers, referred to as ICP-SCC. Our contribution is twofold: first, we present two nonconformity measures that produce reliable confidence values; second, we propose a batch-mode active learning algorithm within the conformal prediction framework that improves classification performance by selecting training instances according to two criteria, informativeness and diversity.
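
As a rough illustration of how an inductive conformal predictor turns nonconformity scores into confidence values, the sketch below computes p-values against a held-out calibration set and keeps every label whose p-value exceeds a chosen significance level. The function names and the toy scores are assumptions made for illustration; the two nonconformity measures proposed in the paper are not reproduced here.

```python
def icp_p_value(calibration_scores, test_score):
    """Share of calibration scores at least as nonconforming as the test score.

    The +1 in numerator and denominator counts the test instance itself.
    """
    at_least_as_strange = sum(1 for s in calibration_scores if s >= test_score)
    return (at_least_as_strange + 1) / (len(calibration_scores) + 1)


def prediction_set(calibration_scores, candidate_scores, significance):
    """Return the labels whose p-value exceeds the significance level."""
    return {
        label
        for label, score in candidate_scores.items()
        if icp_p_value(calibration_scores, score) > significance
    }


# Hypothetical nonconformity scores: higher means "stranger".
calibration = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
candidates = {"cat": 0.15, "dog": 0.95}

labels = prediction_set(calibration, candidates, significance=0.2)
print(labels)
```

A label with a low nonconformity score looks typical relative to the calibration set and receives a high p-value, so it stays in the prediction set; a label that looks stranger than every calibration example is rejected.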

Paper Details

Authors:
Kenneth E. Barner
Submitted On:
14 May 2019 - 10:41am

Document Files

Poster ICASSP 2019


[1] Kenneth E. Barner, "INDUCTIVE CONFORMAL PREDICTOR FOR SPARSE CODING CLASSIFIERS: APPLICATIONS TO IMAGE CLASSIFICATION", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4512. Accessed: Jul. 19, 2019.

Phoneme Level Language Models for Sequence Based Low Resource ASR


Building multilingual and crosslingual models helps bring different languages together in a language-universal space. It allows models to share parameters and transfer knowledge across languages, enabling faster and better adaptation to a new language. These approaches are particularly useful for low-resource languages. In this paper, we propose a phoneme-level language model that can be used multilingually and for crosslingual adaptation to a target language.

Paper Details

Authors:
Siddharth Dalmia, Xinjian Li, Alan W Black, Florian Metze
Submitted On:
14 May 2019 - 10:39am

Document Files

PLMs_ICASSP_Poster (1).pdf


[1] Siddharth Dalmia, Xinjian Li, Alan W Black, Florian Metze, "Phoneme Level Language Models for Sequence Based Low Resource ASR", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4511. Accessed: Jul. 19, 2019.

One-Bit Unlimited Sampling


Conventional analog-to-digital converters (ADCs) are limited in dynamic range. If a signal exceeds some prefixed threshold, the ADC saturates and the resulting signal is clipped, which makes it prone to aliasing artifacts. Recent developments in ADC design make it possible to overcome this limitation: using a modulo operation, the so-called self-reset ADCs fold amplitudes that exceed the dynamic range back into it. A new (unlimited) sampling theory is currently being developed in the context of this novel class of ADCs.
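
The difference between clipping and modulo folding can be sketched in a few lines. The example below assumes a centered modulo that maps any amplitude into the range [-threshold, threshold); the function names, the threshold, and the test signal are hypothetical choices for illustration, and the one-bit scheme of the paper is not reproduced.

```python
import math


def clip(value, threshold):
    """Saturating ADC: amplitudes beyond the threshold are cut off."""
    return max(-threshold, min(threshold, value))


def fold(value, threshold):
    """Self-reset ADC model: centered modulo into [-threshold, threshold)."""
    return (value + threshold) % (2 * threshold) - threshold


threshold = 1.0
# A sinusoid whose peak amplitude (2.5) exceeds the dynamic range.
signal = [2.5 * math.sin(2 * math.pi * k / 16) for k in range(16)]

clipped = [clip(x, threshold) for x in signal]
folded = [fold(x, threshold) for x in signal]

for x, c, f in zip(signal, clipped, folded):
    print(f"{x:+.3f}  clipped {c:+.3f}  folded {f:+.3f}")
```

Clipping discards everything above the threshold, whereas each folded sample differs from the true sample by an integer multiple of twice the threshold, which is the structure that recovery methods can exploit.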

Paper Details

Authors:
Felix Krahmer
Submitted On:
14 May 2019 - 10:46am

Document Files

ICASSP19_GBK.pdf


[1] Felix Krahmer, "One-Bit Unlimited Sampling", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4510. Accessed: Jul. 19, 2019.

Subword Regularization and Beam Search Decoding for End-to-End Automatic Speech Recognition


In this paper, we experiment with the recently introduced subword regularization technique (Kudo, 2018) in the context of end-to-end automatic speech recognition (ASR). We present results from both attention-based and CTC-based ASR systems on two common benchmark datasets, the 80-hour Wall Street Journal corpus and the 1,000-hour LibriSpeech corpus. We also introduce a novel subword beam search decoding algorithm that significantly improves the final performance of the CTC-based systems.
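
The core idea of subword regularization, sampling a different segmentation of the same word each time it is seen in training, can be sketched with a toy unigram model. Everything below is an assumption made for illustration: the piece probabilities are invented, the exhaustive enumeration only works for short words, and the smoothing exponent `alpha` follows the general idea of the technique rather than any specific implementation. The beam search decoder from the paper is not shown.

```python
import random

# Hypothetical unigram probabilities for subword pieces (not from the paper).
PIECE_PROBS = {
    "h": 0.05, "e": 0.05, "l": 0.05, "o": 0.05,
    "he": 0.10, "ll": 0.10, "lo": 0.10,
    "hell": 0.15, "hello": 0.35,
}


def segmentations(word):
    """Yield every way to split `word` into pieces found in PIECE_PROBS."""
    if not word:
        yield []
        return
    for end in range(1, len(word) + 1):
        piece = word[:end]
        if piece in PIECE_PROBS:
            for rest in segmentations(word[end:]):
                yield [piece] + rest


def score(pieces):
    """Unigram probability of a segmentation: product of piece probabilities."""
    prob = 1.0
    for piece in pieces:
        prob *= PIECE_PROBS[piece]
    return prob


def sample_segmentation(word, alpha=1.0, rng=random):
    """Sample one segmentation with probability proportional to score ** alpha."""
    candidates = list(segmentations(word))
    weights = [score(c) ** alpha for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


rng = random.Random(0)
for _ in range(3):
    print(sample_segmentation("hello", alpha=0.5, rng=rng))
```

A smaller `alpha` flattens the distribution so that less likely segmentations are sampled more often; a large `alpha` concentrates the samples on the single most probable segmentation.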

Paper Details

Authors:
Jennifer Drexler, James Glass
Submitted On:
14 May 2019 - 9:04am

Document Files

ICASSP_poster_final.pdf


[1] Jennifer Drexler, James Glass, "Subword Regularization and Beam Search Decoding for End-to-End Automatic Speech Recognition", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4509. Accessed: Jul. 19, 2019.

AN HETEROGENEOUS COMPILER OF DATAFLOW PROGRAMS FOR ZYNQ PLATFORMS


In recent years, the number and variety of heterogeneous multiprocessor systems-on-chip (MPSoCs), such as Zynq platforms, have increased considerably. However, all current design-flow solutions capable of programming the different components of such platforms require the designer to modify either the software-based or the hardware-based design to obtain higher-performance implementations. Thus, the developer must either rewrite functional blocks in HDL or use high-level synthesis of C-like sequential languages with platform-locked extensions.

Paper Details

Authors:
Simone Casale Brunet, Romuald Mosqueron, Marco Mattavelli
Submitted On:
14 May 2019 - 8:07am

Document Files

main-poster.pdf


[1] Simone Casale Brunet, Romuald Mosqueron, Marco Mattavelli, "AN HETEROGENEOUS COMPILER OF DATAFLOW PROGRAMS FOR ZYNQ PLATFORMS", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4508. Accessed: Jul. 19, 2019.

ACOUSTICALLY GROUNDED WORD EMBEDDINGS FOR IMPROVED ACOUSTICS-TO-WORD SPEECH RECOGNITION

Paper Details

Authors:
Karen Livescu, Michael Picheny
Submitted On:
14 May 2019 - 7:08am

Document Files

icassp_official_final.pdf


[1] Karen Livescu, Michael Picheny, "ACOUSTICALLY GROUNDED WORD EMBEDDINGS FOR IMPROVED ACOUSTICS-TO-WORD SPEECH RECOGNITION", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4506. Accessed: Jul. 19, 2019.

DATA-SELECTIVE LMS-NEWTON AND LMS-QUASI-NEWTON ALGORITHMS


The huge volume of data that are available today requires data-selective processing approaches that avoid the costs in computational complexity via appropriately treating the non-innovative data. In this paper, extensions of the well-known adaptive filtering LMS-Newton and LMS-Quasi-Newton Algorithms are developed that enable data selection while also addressing the censorship of outliers that emerge due to high measurement errors. The proposed solutions allow the prescription of how often the acquired data are
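
The data-selection idea can be illustrated with a much simpler filter than the ones in the paper. The sketch below uses a plain LMS update, not LMS-Newton or LMS-Quasi-Newton, and assumes two hypothetical error thresholds: samples whose error magnitude falls below `low` are treated as non-innovative and skipped, and samples whose error magnitude exceeds `high` are treated as outliers and censored. All names, thresholds, and the toy system are assumptions for illustration.

```python
import random


def data_selective_lms(inputs, desired, num_taps, step, low, high):
    """Plain LMS that updates only when low < |error| < high."""
    weights = [0.0] * num_taps
    updates = 0
    for k in range(num_taps - 1, len(inputs)):
        # Regressor: current sample followed by the previous ones.
        x = [inputs[k - i] for i in range(num_taps)]
        error = desired[k] - sum(w * xi for w, xi in zip(weights, x))
        if low < abs(error) < high:  # innovative, and not an outlier
            weights = [w + step * error * xi for w, xi in zip(weights, x)]
            updates += 1
    return weights, updates


rng = random.Random(0)
true_weights = [0.5, -0.3]  # hypothetical unknown system
inputs = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
desired = [0.0] * len(inputs)
for k in range(1, len(inputs)):
    desired[k] = true_weights[0] * inputs[k] + true_weights[1] * inputs[k - 1]
desired[100] += 50.0  # inject one gross measurement error

weights, updates = data_selective_lms(
    inputs, desired, num_taps=2, step=0.1, low=0.01, high=5.0
)
print(weights, updates)
```

Once the filter has converged, most samples produce errors below the lower threshold and trigger no update, which is where the computational savings come from; the injected outlier exceeds the upper threshold and is ignored.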

Paper Details

Authors:
Submitted On:
14 May 2019 - 5:42am

Document Files

presentation_tsinos.pdf


[1] "DATA-SELECTIVE LMS-NEWTON AND LMS-QUASI-NEWTON ALGORITHMS", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4505. Accessed: Jul. 19, 2019.

AUDIO FEATURE GENERATION FOR MISSING MODALITY PROBLEM IN VIDEO ACTION RECOGNITION


Despite the recent success of multi-modal action recognition in videos, in practice some data are often unavailable, especially in multi-modal settings. For example, while both vision and audio data are required for multi-modal action recognition, audio tracks in videos are easily lost because of broken files or device limitations. To cope with this missing-sound problem, we present an approach that simulates deep audio features from spatio-temporal vision data alone.

Paper Details

Authors:
Hu-Cheng Lee, Chih-Yu Lin, Pin-Chun Hsu, Winston H. Hsu
Submitted On:
14 May 2019 - 5:08am

Document Files

20190516_AUDIO_FEATURE_GENERATION_FOR_MISSING_MODALITY_PROBLEM_IN_VIDEO_ACTION_RECOGNITION.pptx


[1] Hu-Cheng Lee, Chih-Yu Lin, Pin-Chun Hsu, Winston H. Hsu, "AUDIO FEATURE GENERATION FOR MISSING MODALITY PROBLEM IN VIDEO ACTION RECOGNITION", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4504. Accessed: Jul. 19, 2019.
