Speaker Recognition and Characterization (SPE-SPKR)

Speaker Diarization with LSTM


For many years, i-vector based audio embedding techniques were the dominant approach for speaker verification and speaker diarization applications. However, mirroring the rise of deep learning in various domains, neural network based audio embeddings, also known as d-vectors, have consistently demonstrated superior speaker verification performance. In this paper, we build on the success of d-vector based speaker verification systems to develop a new d-vector based approach to speaker diarization.
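The abstract does not say how the d-vectors are grouped into speakers, so the following is only a minimal sketch of the general idea, assuming a simple greedy clustering by cosine similarity. The function names, the threshold value, and the toy two-dimensional embeddings are all hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def assign_speakers(embeddings, threshold=0.7):
    """Greedily label each segment embedding with a speaker index.

    A segment joins the most similar existing cluster if the similarity
    reaches the threshold (an assumed value); otherwise it starts a new one.
    """
    centroids = []  # running sum of embeddings per cluster
    labels = []
    for emb in embeddings:
        best, best_sim = None, threshold
        for idx, centroid in enumerate(centroids):
            sim = cosine_similarity(emb, centroid)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            centroids.append(list(emb))
            labels.append(len(centroids) - 1)
        else:
            # Cosine similarity ignores scale, so a running sum
            # behaves like a running mean here.
            centroids[best] = [c + e for c, e in zip(centroids[best], emb)]
            labels.append(best)
    return labels


# Toy segment embeddings standing in for d-vectors.
segments = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [1.0, 0.0], [0.1, 0.9]]
labels = assign_speakers(segments)
```

In this toy input the first, second, and fourth segments point in roughly the same direction, so they share one label, while the third and fifth share another.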

Paper Details

Authors:
Quan Wang, Carlton Downey, Li Wan, Philip Andrew Mansfield, Ignacio Lopez Moreno
Submitted On:
12 April 2018 - 11:54am

Document Files

icassp2018_poster_quan_diarization

(170)

[1] Quan Wang, Carlton Downey, Li Wan, Philip Andrew Mansfield, Ignacio Lopez Moreno, "Speaker Diarization with LSTM", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/2421. Accessed: Nov. 21, 2019.

ATTENTION-BASED MODELS FOR TEXT-DEPENDENT SPEAKER VERIFICATION


Attention-based models have recently shown great performance on a range of tasks, such as speech recognition, machine translation, and image captioning, due to their ability to summarize relevant information that spans the entire length of an input sequence. In this paper, we analyze the use of attention mechanisms for the problem of sequence summarization in our end-to-end text-dependent speaker recognition system. We explore different topologies of the attention layer and their variants, and compare different pooling methods on the attention weights.
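As a rough illustration of sequence summarization with attention, the sketch below pools a sequence of frame-level vectors into one fixed-length vector using softmax weights. The dot-product scoring against a `query` vector is an assumption made for this example; the abstract does not specify the scoring function, topology, or pooling variants that the paper compares.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Summarize a sequence of frame vectors as one weighted average.

    frames: list of equal-length float lists (hypothetical per-frame outputs).
    query:  hypothetical learned scoring vector of the same length.
    """
    # Score each frame by its dot product with the query vector.
    scores = [sum(f_i * q_i for f_i, q_i in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    # The weighted sum over time gives a fixed-length utterance summary.
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frames)) for d in range(dim)]
    return pooled, weights


frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# A zero query scores every frame equally, so pooling reduces to a plain mean.
pooled, weights = attention_pool(frames, query=[0.0, 0.0])
```

With a non-zero `query`, frames that align with it receive larger weights and dominate the summary, which is the behaviour that distinguishes attention pooling from simple averaging.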

Paper Details

Authors:
F A Rezaur Rahman Chowdhury, Quan Wang, Ignacio Lopez Moreno, Li Wan
Submitted On:
12 April 2018 - 11:42am

Document Files

icassp2018_poster_reza_attention

(166)

[1] F A Rezaur Rahman Chowdhury, Quan Wang, Ignacio Lopez Moreno, Li Wan, "ATTENTION-BASED MODELS FOR TEXT-DEPENDENT SPEAKER VERIFICATION", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/2413. Accessed: Nov. 21, 2019.

A new approach for robust replay spoof detection in ASV systems


The objective of this paper is to extract robust features for detecting replay spoof attacks on text-independent speaker verification systems. In the case of replay attacks, a prerecorded utterance of the target speaker is played to the automatic speaker verification (ASV) system to gain unauthorized access. In such a scenario, the speech signal carries the characteristics of the intermediate recording device as well. In the proposed approach, the characteristics of the intermediate de-

Paper Details

Authors:
B Shaik Mohammad Rafi
Submitted On:
11 November 2017 - 8:08am

Document Files

A new approach for robust replay spoof detection in ASV systems.pdf

(30)

[1] B Shaik Mohammad Rafi, "A new approach for robust replay spoof detection in ASV systems", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/2306. Accessed: Nov. 21, 2019.

I-VECTOR/PLDA SPEAKER RECOGNITION USING SUPPORT VECTORS WITH DISCRIMINANT ANALYSIS


The i-vector feature representation with probabilistic linear discriminant analysis (PLDA) scoring has recently achieved effective performance in speaker recognition systems, even under channel-mismatch conditions. In general, experiments carried out using this combined strategy employ linear discriminant analysis (LDA) after the i-vector extraction phase to suppress irrelevant directions, such as those introduced by noise or channel distortions. However, speaker-related and non-speaker-related variability present in the data may prevent LDA from finding the best projection matrix.

Paper Details

Authors:
Fahimeh Bahmaninezhad, John H.L. Hansen
Submitted On:
25 March 2017 - 2:33am

Document Files

Poster-Slides

(304)

Paper

(310)

[1] Fahimeh Bahmaninezhad, John H.L. Hansen, "I-VECTOR/PLDA SPEAKER RECOGNITION USING SUPPORT VECTORS WITH DISCRIMINANT ANALYSIS", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1744. Accessed: Nov. 21, 2019.

APPLYING COMPENSATION TECHNIQUES ON I-VECTORS EXTRACTED FROM SHORT-TEST UTTERANCES FOR SPEAKER VERIFICATION USING DEEP NEURAL NETWORK


We propose a method to improve speaker verification performance when a test utterance is very short. In some situations with short test utterances, the performance of i-vector/probabilistic linear discriminant analysis systems degrades. The proposed method uses a deep neural network to transform short-utterance feature vectors into adequate vectors, which compensates for the short utterances.

Paper Details

Authors:
IL-Ho Yang, Hee-Soo Heo, Sung-Hyun Yoon, and Ha-Jin Yu
Submitted On:
8 March 2017 - 11:53pm

Document Files

poster.pdf

(903)

[1] IL-Ho Yang, Hee-Soo Heo, Sung-Hyun Yoon, and Ha-Jin Yu, "APPLYING COMPENSATION TECHNIQUES ON I-VECTORS EXTRACTED FROM SHORT-TEST UTTERANCES FOR SPEAKER VERIFICATION USING DEEP NEURAL NETWORK", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1715. Accessed: Nov. 21, 2019.

DNN APPROACH TO SPEAKER DIARISATION USING SPEAKER CHANNELS

Paper Details

Authors:
Rosanna Milner, Thomas Hain
Submitted On:
7 March 2017 - 8:40am

Document Files

talk-dia-icassp17-milner.pdf

(274)

talk-dia-icassp17-milner.pdf

(262)

[1] Rosanna Milner, Thomas Hain, "DNN APPROACH TO SPEAKER DIARISATION USING SPEAKER CHANNELS", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1677. Accessed: Nov. 21, 2019.

SPEAKER SEGMENTATION USING DEEP SPEAKER VECTORS FOR FAST SPEAKER CHANGE SCENARIOS


A novel speaker segmentation approach based on a deep neural network is proposed and investigated. This approach uses deep speaker vectors (d-vectors) to represent speaker characteristics and to find speaker change points. The d-vector is a kind of frame-level speaker recognition feature, whose discriminative training process corresponds to the goal of discriminating a speaker change point from a single-speaker speech segment in a short time window.
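One way to picture change-point detection on frame-level embeddings is to compare the average embedding just before and just after each frame. The sketch below does this with cosine distance and a fixed threshold. Both choices, along with the window size and the toy embeddings, are assumptions for illustration; the abstract does not state the decision rule actually used.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def cosine_distance(a, b):
    """One minus cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def change_points(embeddings, window=2, threshold=0.5):
    """Return frame indices where the left and right windows disagree."""
    points = []
    for t in range(window, len(embeddings) - window + 1):
        left = mean_vector(embeddings[t - window:t])
        right = mean_vector(embeddings[t:t + window])
        if cosine_distance(left, right) > threshold:
            points.append(t)
    return points


# Toy frame-level embeddings: four frames of one speaker, then four of another.
embeddings = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
detected = change_points(embeddings)
```

A short window reacts quickly to fast speaker changes but averages fewer frames, so in practice the window size and threshold would need tuning on real data.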

Paper Details

Submitted On:
28 February 2017 - 4:11am

Document Files

SPEAKER SEGMENTATION USING DEEP SPEAKER VECTORS FOR FAST SPEAKER CHANGE SCENARIOS

(25)

[1] "SPEAKER SEGMENTATION USING DEEP SPEAKER VECTORS FOR FAST SPEAKER CHANGE SCENARIOS", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1490. Accessed: Nov. 21, 2019.

EXPLORING UNIVERSAL SPEECH ATTRIBUTES FOR SPEAKER VERIFICATION


The universal speech attributes for speaker verification (SV) are addressed in this paper. The aim of this work is to exploit fundamental characteristics across different speakers within the deep neural network (DNN)/i-vector framework. The manner and place of articulation form the fundamental speech attribute unit inventory, and new attribute units for acoustic modelling are generated by a two-step automatic clustering method in this paper. The DNN based on universal attribute units is used to generate posterior

Paper Details

Authors:
Sheng Zhang, Wu Guo, Guoping Hu
Submitted On:
27 February 2017 - 8:54pm

Document Files

ICASSP2017_shengzhang_v2.pdf

(312)

[1] Sheng Zhang, Wu Guo, Guoping Hu, "EXPLORING UNIVERSAL SPEECH ATTRIBUTES FOR SPEAKER VERIFICATION", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1462. Accessed: Nov. 21, 2019.

Senone I-Vectors for Robust Speaker Verification

Paper Details

Submitted On:
15 October 2016 - 7:50am

Document Files

ISCSLP16_SenoneIvector.pdf

(367)

[1] "Senone I-Vectors for Robust Speaker Verification", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1219. Accessed: Nov. 21, 2019.

Digit-dependent Local I-Vector for Text-Prompted Speaker Verification with Random Digit Sequences


The widely adopted i-vector performs well in text-independent speaker verification with long speech duration. This paper addresses how to integrate the state-of-the-art i-vector framework into text-prompted speaker verification. To take advantage of the lexical information and enhance the performance of speaker verification with random digit sequences, this paper proposes to extract a set of digit-dependent local i-vectors from the utterance instead of extracting a single i-vector. The digit-dependent local i-vector is considered

Paper Details

Authors:
Peixin Chen, Wu Guo, Guoping Hu
Submitted On:
14 October 2016 - 10:24pm

Document Files

ISCSLP2016_PeixinChen.pdf

(322)

[1] Peixin Chen, Wu Guo, Guoping Hu, "Digit-dependent Local I-Vector for Text-Prompted Speaker Verification with Random Digit Sequences", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1213. Accessed: Nov. 21, 2019.