ICASSP 2020

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. ICASSP 2020 will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

Multi-scale Octave Convolutions for Robust Speech Recognition


We propose a multi-scale octave convolution layer to learn robust speech representations efficiently. Octave convolutions were introduced by Chen et al. [1] in the computer vision field to reduce the spatial redundancy of feature maps by decomposing the output of a convolutional layer into feature maps at two different spatial resolutions, one octave apart. This approach improved both the efficiency and the accuracy of CNN models. The accuracy gain was attributed to the enlargement of the receptive field in the original input space.
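
The abstract does not spell out the multi-scale generalization, but the underlying two-resolution octave convolution of Chen et al. [1] is well known. Below is a minimal PyTorch sketch of such a layer; the class name OctaveConv, the channel split alpha, and all tensor sizes are illustrative choices of ours, not the paper's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OctaveConv(nn.Module):
    """Two-resolution octave convolution (after Chen et al.); a minimal sketch.
    `alpha` is the fraction of channels kept at the low (half) resolution."""
    def __init__(self, in_ch, out_ch, kernel_size=3, alpha=0.5):
        super().__init__()
        self.in_lo, self.in_hi = int(alpha * in_ch), in_ch - int(alpha * in_ch)
        self.out_lo, self.out_hi = int(alpha * out_ch), out_ch - int(alpha * out_ch)
        pad = kernel_size // 2
        # Four paths: high->high, high->low, low->high, low->low.
        self.hh = nn.Conv2d(self.in_hi, self.out_hi, kernel_size, padding=pad)
        self.hl = nn.Conv2d(self.in_hi, self.out_lo, kernel_size, padding=pad)
        self.lh = nn.Conv2d(self.in_lo, self.out_hi, kernel_size, padding=pad)
        self.ll = nn.Conv2d(self.in_lo, self.out_lo, kernel_size, padding=pad)

    def forward(self, x_hi, x_lo):
        # High-frequency output: same-resolution conv plus upsampled low path.
        y_hi = self.hh(x_hi) + F.interpolate(self.lh(x_lo), scale_factor=2,
                                             mode='nearest')
        # Low-frequency output: downsampled high path plus same-resolution conv.
        y_lo = self.hl(F.avg_pool2d(x_hi, 2)) + self.ll(x_lo)
        return y_hi, y_lo

# Example: 64-channel features with half the channels one octave lower.
conv = OctaveConv(64, 64, alpha=0.5)
x_hi = torch.randn(1, 32, 80, 80)   # high-resolution half of the channels
x_lo = torch.randn(1, 32, 40, 40)   # low-resolution half, one octave down
y_hi, y_lo = conv(x_hi, x_lo)
```

The four convolution paths let the high- and low-frequency feature maps exchange information at every layer, which is where the enlarged effective receptive field mentioned above comes from.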

Paper Details

Authors:
Joanna Rownicka, Peter Bell, Steve Renals
Submitted On:
16 May 2020 - 8:58am

Document Files

ICASSP2020_JRownicka_slides.pdf

Cite:
Joanna Rownicka, Peter Bell, Steve Renals, "Multi-scale Octave Convolutions for Robust Speech Recognition", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5374. Accessed: Oct. 27, 2020.

PORTFOLIO CUTS: A GRAPH-THEORETIC FRAMEWORK TO DIVERSIFICATION

Paper Details

Authors:
Bruno Scalzo, Ljubisa Stankovic, Anthony G. Constantinides, Danilo P. Mandic
Submitted On:
16 May 2020 - 8:06am

Document Files

Portfolio_Cuts_Slides.pdf

Cite:
Bruno Scalzo, Ljubisa Stankovic, Anthony G. Constantinides, Danilo P. Mandic, "PORTFOLIO CUTS: A GRAPH-THEORETIC FRAMEWORK TO DIVERSIFICATION", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5373. Accessed: Oct. 27, 2020.

Teaching Signals and Systems - A First Course in Signal Processing


Signals and Systems is a well-known fundamental course in signal processing. How this course is taught can determine whether a student pursues a career in the field. With this in mind, this paper reflects on the authors' experiences teaching the course. In addition, the authors share their experience of creating and running a Massive Open Online Course (MOOC) on the subject on edX, and of following it up with discussions with students who took the course through the platform.

Paper Details

Authors:
Nikhar P. Rakhashia, Ankit A. Bhurane, Vikram M. Gadre
Submitted On:
16 May 2020 - 8:02am

Document Files

Presentation Slides for 'Teaching Signals and Systems - A First Course in Signal Processing'

Cite:
Nikhar P. Rakhashia, Ankit A. Bhurane, Vikram M. Gadre, "Teaching Signals and Systems - A First Course in Signal Processing", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5372. Accessed: Oct. 27, 2020.

JOINT ENHANCEMENT AND DENOISING OF LOW LIGHT IMAGES VIA JND TRANSFORM


Low light images suffer from low dynamic range and severe noise due to low signal-to-noise ratio (SNR). In this paper, we propose joint enhancement and denoising of low light images via a just-noticeable-difference (JND) transform. We achieve contrast enhancement and noise reduction simultaneously based on human visual perception. First, we perform contrast enhancement based on a perceptual histogram to effectively allocate the dynamic range while preventing over-enhancement. Second, we generate a JND map from a human visual system (HVS) response model of foreground and background luminance, which we call the JND transform.
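
The abstract does not give the exact HVS response model, so as a stand-in here is a NumPy sketch of the classic luminance-adaptation JND threshold of Chou and Li (1995), a common building block for JND maps, together with a toy way of scaling denoising strength by visibility. The constants and the usage pattern are illustrative assumptions, not the paper's method.

```python
import numpy as np

def luminance_jnd(bg):
    """Luminance-adaptation JND threshold (after Chou & Li, 1995).
    bg: local background luminance in [0, 255]; returns the smallest
    intensity change assumed visible at that luminance."""
    bg = np.asarray(bg, dtype=np.float64)
    dark = 17.0 * (1.0 - np.sqrt(bg / 127.0)) + 3.0   # high threshold in dark areas
    bright = 3.0 / 128.0 * (bg - 127.0) + 3.0         # slowly rising when bright
    return np.where(bg <= 127, dark, bright)

# Toy usage: filter harder where an assumed noise level exceeds visibility.
img = np.random.randint(0, 256, (64, 64)).astype(np.float64)
bg = img                    # in practice: a local mean over a neighborhood
noise_sigma = 2.0           # assumed noise standard deviation
strength = np.clip(noise_sigma / luminance_jnd(bg), 0.0, 1.0)
```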

Paper Details

Authors:
Long Yu, Haonan Su, Cheolkon Jung
Submitted On:
16 May 2020 - 6:14am

Document Files

ICASSP_2020.ppt

Cite:
Long Yu, Haonan Su, Cheolkon Jung, "JOINT ENHANCEMENT AND DENOISING OF LOW LIGHT IMAGES VIA JND TRANSFORM", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5371. Accessed: Oct. 27, 2020.

Multi-Layer Content Interaction Through Quaternion Product for Visual Question Answering


Multi-modality fusion technologies have greatly improved the performance of neural network-based Video Description/Captioning, Visual Question Answering (VQA) and Audio-Visual Scene-aware Dialog (AVSD) in recent years. Most previous approaches explore feature fusion only at the last layers, omitting the importance of intermediate layers. To address this issue, we propose an efficient Quaternion Block Network (QBN) that learns interactions not only for the last layer but for all intermediate layers simultaneously.
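
The interaction operator in a quaternion block is the Hamilton product. Below is a minimal PyTorch sketch of the Hamilton product for feature tensors whose channel dimension is split into the four quaternion components; applying it between two layers' features is our illustrative reading, not necessarily the exact QBN block.

```python
import torch

def hamilton_product(p, q):
    """Hamilton product of two quaternion feature tensors; a minimal sketch.
    The channel dimension is split into four equal parts (r, i, j, k)."""
    a1, b1, c1, d1 = p.chunk(4, dim=-1)
    a2, b2, c2, d2 = q.chunk(4, dim=-1)
    r = a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2
    i = a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2
    j = a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2
    k = a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2
    return torch.cat([r, i, j, k], dim=-1)

# Toy interaction between features from two layers (channels divisible by 4).
f_layer1 = torch.randn(8, 512)
f_layer2 = torch.randn(8, 512)
interaction = hamilton_product(f_layer1, f_layer2)  # shape (8, 512)
```

Unlike an element-wise product, the Hamilton product mixes all four component groups, so every output group depends on every input group, giving a richer cross-feature interaction at the same parameter cost.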

Paper Details

Authors:
Lei Shi, Shijie Geng, Kai Shuang, Chiori Hori, Songxiang Liu, Peng Gao, Sen Su
Submitted On:
16 May 2020 - 1:27am

Document Files

Icassp2020_Multi-Layer_Content_Interaction_Through_Quaternion_Product_for_Visual_Question_Answering

Cite:
Lei Shi, Shijie Geng, Kai Shuang, Chiori Hori, Songxiang Liu, Peng Gao, Sen Su, "Multi-Layer Content Interaction Through Quaternion Product for Visual Question Answering", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5370. Accessed: Oct. 27, 2020.

WHAT MAKES THE SOUND?: A DUAL-MODALITY INTERACTING NETWORK FOR AUDIO-VISUAL EVENT LOCALIZATION


The presence of auditory and visual senses enables humans to obtain a profound understanding of real-world scenes. While audio and visual signals individually provide scene knowledge, their combination offers better insight into the underlying event. In this paper, we address the problem of audio-visual event localization, where the goal is to identify the presence of an event that is both audible and visible in a video, using fully or weakly supervised learning.

Paper Details

Authors:
Janani Ramaswamy
Submitted On:
16 May 2020 - 1:09am

Document Files

What_makes_the_sound_ICASSP2020.pdf

Cite:
Janani Ramaswamy, "WHAT MAKES THE SOUND?: A DUAL-MODALITY INTERACTING NETWORK FOR AUDIO-VISUAL EVENT LOCALIZATION", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5369. Accessed: Oct. 27, 2020.

Expression Guided EEG Representation Learning for Emotion Recognition


Learning a joint and coordinated representation between different modalities can improve multimodal emotion recognition. In this paper, we propose a deep representation learning approach for emotion recognition from electroencephalogram (EEG) signals guided by facial electromyogram (EMG) and electrooculogram (EOG) signals. We recorded EEG, EMG and EOG signals from 60 participants who watched 40 short videos and self-reported their emotions.
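
The abstract does not state the training objective, so the following PyTorch sketch shows one plausible reading of "guided" representation learning: an emotion classifier on the EEG embedding plus an auxiliary loss aligning it with an expression (EMG/EOG) embedding. Every dimension, architecture choice and loss weight here is our assumption.

```python
import torch
import torch.nn as nn

# Hypothetical encoders; sizes are illustrative, not the paper's architecture.
eeg_enc = nn.Sequential(nn.Linear(32 * 128, 256), nn.ReLU(), nn.Linear(256, 64))
exp_enc = nn.Sequential(nn.Linear(4 * 128, 128), nn.ReLU(), nn.Linear(128, 64))
classifier = nn.Linear(64, 2)          # e.g., low/high arousal

eeg = torch.randn(16, 32 * 128)        # 32 EEG channels, flattened windows
exp = torch.randn(16, 4 * 128)         # EMG + EOG channels, flattened windows
labels = torch.randint(0, 2, (16,))

z_eeg, z_exp = eeg_enc(eeg), exp_enc(exp)
cls_loss = nn.functional.cross_entropy(classifier(z_eeg), labels)
# Guidance term: pull EEG embeddings toward the expression embeddings.
guide_loss = nn.functional.mse_loss(z_eeg, z_exp.detach())
loss = cls_loss + 0.1 * guide_loss     # assumed weighting
loss.backward()
```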

Paper Details

Authors:
Soheil Rayatdoost, David Rudrauf, Mohammad Soleymani
Submitted On:
16 May 2020 - 12:55am

Document Files

rayatdoost_ICASSP_17_04.pdf

Cite:
Soheil Rayatdoost, David Rudrauf, Mohammad Soleymani, "Expression Guided EEG Representation Learning for Emotion Recognition", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5368. Accessed: Oct. 27, 2020.

Multi-Patch Aggregation Models for Resampling Detection


Images captured nowadays have varying dimensions, with smartphones and DSLRs allowing users to choose from a list of available image resolutions. It is therefore imperative for forensic algorithms such as resampling detection to scale well to images of varying dimensions. However, in our experiments we observed that many state-of-the-art forensic algorithms are sensitive to image size, and their performance quickly degrades on images of diverse dimensions, even when re-trained using multiple image sizes.
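
One standard way to make a detector insensitive to image size, in the spirit of multi-patch aggregation, is to score fixed-size patches and pool the per-patch scores. The PyTorch sketch below shows that general pattern; the toy backbone, patch size and mean pooling are illustrative assumptions, not the paper's models.

```python
import torch
import torch.nn as nn

class PatchAggregator(nn.Module):
    """Run a fixed-size patch classifier over an image of arbitrary size
    and aggregate the per-patch scores; a minimal sketch of the idea."""
    def __init__(self, patch=64):
        super().__init__()
        self.patch = patch
        self.backbone = nn.Sequential(               # toy patch scorer
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1))

    def forward(self, img):  # img: (1, 1, H, W), H and W multiples of patch
        p = self.patch
        patches = img.unfold(2, p, p).unfold(3, p, p)   # tile into p x p patches
        patches = patches.reshape(-1, 1, p, p)          # one batch of patches
        scores = self.backbone(patches)                 # per-patch logits
        return scores.mean()                            # aggregate (mean pooling)

model = PatchAggregator()
score = model(torch.randn(1, 1, 256, 192))  # works for any patch-aligned size
```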

Paper Details

Authors:
Mohit Lamba, Kaushik Mitra
Submitted On:
16 May 2020 - 12:48am

Document Files

Presentation Slides for the project

Cite:
Mohit Lamba, Kaushik Mitra, "Multi-Patch Aggregation Models for Resampling Detection", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5367. Accessed: Oct. 27, 2020.

COMBINING DEEP EMBEDDINGS OF ACOUSTIC AND ARTICULATORY FEATURES FOR SPEAKER IDENTIFICATION


In this study, deep embeddings of acoustic and articulatory features are combined for speaker identification. First, a convolutional neural network (CNN)-based universal background model (UBM) is constructed to generate an acoustic feature (AC) embedding. In addition, since articulatory features (AFs) represent important phonological properties of speech production, a multilayer perceptron (MLP)-based model is also constructed to extract an AF embedding.
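
A minimal PyTorch sketch of the combination step, assuming the two embeddings are simply concatenated and fed to a speaker classifier; the embedding sizes, fusion layers and speaker count are illustrative assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

ac_embed = torch.randn(16, 512)   # acoustic embedding from the CNN-based UBM
af_embed = torch.randn(16, 128)   # articulatory-feature embedding from the MLP
fusion = nn.Sequential(nn.Linear(512 + 128, 256), nn.ReLU(),
                       nn.Linear(256, 1000))  # scores over 1000 speakers
logits = fusion(torch.cat([ac_embed, af_embed], dim=1))
```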

Paper Details

Authors:
Qian-Bei Hong, Chung-Hsien Wu, Hsin-Min Wang, Chien-Lin Huang
Submitted On:
16 May 2020 - 12:02am

Document Files

20200419_ICASSP_paper2.pdf

Cite:
Qian-Bei Hong, Chung-Hsien Wu, Hsin-Min Wang, Chien-Lin Huang, "COMBINING DEEP EMBEDDINGS OF ACOUSTIC AND ARTICULATORY FEATURES FOR SPEAKER IDENTIFICATION", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5366. Accessed: Oct. 27, 2020.

Statistics Pooling Time Delay Neural Network Based on X-vector for Speaker Verification


This paper aims to improve x-vector-based speaker embedding representations by extracting more detailed information for speaker verification. We propose a statistics pooling time delay neural network (TDNN), in which the TDNN structure integrates statistics pooling at each layer, to consider the variation of temporal context in the frame-level transformation. The proposed feature vectors, named stats-vectors, are compared with baseline x-vector features on the VoxCeleb dataset and the Speakers in the Wild (SITW) dataset for speaker verification.
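
Statistics pooling summarizes frame-level features by their mean and standard deviation over time, which is how x-vector systems map variable-length utterances to fixed-size vectors. The PyTorch sketch below shows one TDNN layer that emits its own layer statistics alongside the frame-level output, illustrating the per-layer idea in the abstract; the layer sizes and how the per-layer statistics are subsequently combined are our assumptions.

```python
import torch
import torch.nn as nn

def stats_pool(x):
    """Statistics pooling: mean and std over the time axis.
    x: (batch, channels, time) -> (batch, 2 * channels)."""
    return torch.cat([x.mean(dim=2), x.std(dim=2)], dim=1)

class StatsTDNNLayer(nn.Module):
    """One TDNN (dilated 1-D conv) layer whose frame-level output is also
    summarized by statistics pooling; a sketch of the per-layer idea."""
    def __init__(self, in_ch, out_ch, kernel=5, dilation=1):
        super().__init__()
        pad = (kernel - 1) // 2 * dilation
        self.conv = nn.Conv1d(in_ch, out_ch, kernel,
                              dilation=dilation, padding=pad)
        self.act = nn.ReLU()

    def forward(self, x):
        h = self.act(self.conv(x))   # frame-level features
        return h, stats_pool(h)      # pass frames on, keep layer statistics

feats = torch.randn(8, 30, 200)      # e.g., 30-dim MFCCs over 200 frames
layer = StatsTDNNLayer(30, 512)
frames, layer_stats = layer(feats)   # layer_stats: (8, 1024)
```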

Paper Details

Authors:
Qian-Bei Hong, Chung-Hsien Wu, Hsin-Min Wang, Chien-Lin Huang
Submitted On:
15 May 2020 - 11:55pm

Document Files

20200419_ICASSP_Experiment 1.pdf

Cite:
Qian-Bei Hong, Chung-Hsien Wu, Hsin-Min Wang, Chien-Lin Huang, "Statistics Pooling Time Delay Neural Network Based on X-vector for Speaker Verification", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5365. Accessed: Oct. 27, 2020.
