A COMPLETE END-TO-END SPEAKER VERIFICATION SYSTEM USING DEEP NEURAL NETWORKS: FROM RAW SIGNALS TO VERIFICATION RESULT

End-to-end systems using deep neural networks have been widely studied in the field of speaker verification. Raw audio signal processing has also been widely studied in the fields of automatic music tagging and speech recognition. However, as far as we know, end-to-end systems using raw audio signals have not been explored in speaker verification. In this paper, a complete end-to-end speaker verification system is proposed, which inputs raw audio signals and outputs the verification results. A pre-processing layer and the embedded speaker feature extraction models were mainly investigated. The pro- posed pre-emphasis layer was combined with a strided convolution layer for pre-processing at the first two hidden layers. In addition, speaker feature extraction models using convolutional layer and long short-term memory are proposed to be embedded in the proposed end-to-end system.

ICASSP2018_RAE2E_poster_v3.pdf

ICASSP2018_RAE2E_poster_v3.pdf (758)

Thumbs Up

CITE

Documents

Poster

A COMPLETE END-TO-END SPEAKER VERIFICATION SYSTEM USING DEEP NEURAL NETWORKS: FROM RAW SIGNALS TO VERIFICATION RESULT

ICASSP2018_RAE2E_poster_v3.pdf

QUESTIONS?