TRANSCRIBING LYRICS FROM COMMERCIAL SONG AUDIO: THE FIRST STEP TOWARDS SINGING CONTENT PROCESSING

Abstract: 

Spoken content processing (such as retrieval and browsing) is maturing, but singing content is still almost completely left out. Songs are human voice carrying plenty of semantic information, just as speech does, and may be considered a special type of speech with highly flexible prosody. The various problems in song audio, for example phone durations that change significantly over highly flexible pitch contours, make the recognition of lyrics from song audio much more difficult. This paper reports an initial attempt towards this goal. We collected music-removed versions of English songs directly from commercial singing content. The best results were obtained by a TDNN-BLSTM model with data augmentation by 3-fold speed perturbation plus some special approaches. The WER achieved (73.90%) was significantly lower than the baseline (96.21%), but is still relatively high.
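To put the two WER figures in context, the following is a minimal sketch of how word error rate is conventionally computed (word-level edit distance divided by the number of reference words), together with the relative reduction implied by the reported numbers. The function names and the example lyric strings are hypothetical and are not taken from the paper; the paper's actual scoring setup is not described on this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error removed by the improved system."""
    return (baseline - improved) / baseline


# Hypothetical lyric line, for illustration only.
example_wer = word_error_rate("let it be", "let it bee")

# The two WER values reported in the abstract, in percent.
reported_reduction = relative_reduction(96.21, 73.90)
```

With the reported values, the relative WER reduction over the baseline comes to roughly 23%. Because insertions count as errors, WER can exceed 100%, which is one reason a baseline near 96% indicates almost no usable output.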

Paper Details

Authors: Che-Ping Tsai, Yi-Lin Tuan, Lin-shan Lee
Submitted On: 15 April 2018 - 12:49am
Type: Poster
Presenter's Name: Che-Ping Tsai, Yi-Lin Tuan
Paper Code: SP-P16
Document Year: 2018

Document Files

poster_v4.pdf


[1] Che-Ping Tsai, Yi-Lin Tuan, Lin-shan Lee, "TRANSCRIBING LYRICS FROM COMMERCIAL SONG AUDIO: THE FIRST STEP TOWARDS SINGING CONTENT PROCESSING", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/2878. Accessed: Sep. 20, 2019.