CNN Based Two-Stage Multi-Resolution End-to-End Model for Singing Melody Extraction

Abstract: 

Inspired by human hearing perception, we propose a two-stage multi-resolution end-to-end model for singing melody extraction. Convolutional neural networks (CNNs) form the core of the proposed model and generate the multi-resolution representations. 1-D and 2-D multi-resolution analyses are carried out successively, first on the waveform and then on the resulting spectrogram-like graphs, using 1-D and 2-D CNN kernels of different lengths and sizes. The 1-D CNNs with kernels of different lengths produce multi-resolution spectrogram-like graphs without suffering from the trade-off between spectral and temporal resolution. The 2-D CNNs with kernels of different sizes extract features from spectro-temporal envelopes at different scales. Experimental results show that the proposed model outperforms three compared systems on three out of five public databases.
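
The two stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no hyperparameters, so the kernel lengths, hop size, filter count, 2-D kernel sizes, ReLU nonlinearity, random (untrained) weights, and the function names `conv1d_bank`, `conv2d_same`, and `two_stage_features` are all assumptions chosen only to show the data flow from waveform to multi-resolution feature maps.

```python
import numpy as np


def conv1d_bank(waveform, kernels, hop):
    """Stage 1 (assumed form): strided 1-D correlation with a kernel bank.

    waveform: shape (n_samples,); kernels: shape (n_filters, length).
    Returns a spectrogram-like map of shape (n_filters, n_frames).
    """
    length = kernels.shape[1]
    # Pad so that every kernel length yields the same number of frames.
    left = length // 2
    padded = np.pad(waveform, (left, length - left))
    n_frames = len(waveform) // hop
    # Row t of `frames` holds the samples covered by the kernel at frame t.
    idx = hop * np.arange(n_frames)[:, None] + np.arange(length)[None, :]
    frames = padded[idx]                      # (n_frames, length)
    return np.maximum(kernels @ frames.T, 0)  # ReLU, (n_filters, n_frames)


def conv2d_same(image, kernel):
    """Stage 2 (assumed form): 2-D correlation with zero 'same' padding."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh - 1 - kh // 2),
                            (kw // 2, kw - 1 - kw // 2)))
    windows = np.lib.stride_tricks.sliding_window_view(padded, (kh, kw))
    return np.einsum("ijkl,kl->ij", windows, kernel)


def two_stage_features(waveform, kernel_lengths=(64, 256, 1024),
                       kernel_sizes=(3, 5), n_filters=16, hop=32, seed=0):
    """Stack one feature map per (1-D kernel length, 2-D kernel size) pair.

    All default values are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    features = []
    for length in kernel_lengths:
        # Short kernels favour temporal detail, long kernels spectral detail.
        bank = rng.standard_normal((n_filters, length)) / np.sqrt(length)
        graph = conv1d_bank(waveform, bank, hop)
        for size in kernel_sizes:
            # Small and large 2-D kernels see envelopes at different scales.
            kernel = rng.standard_normal((size, size)) / size
            features.append(conv2d_same(graph, kernel))
    return np.stack(features)


if __name__ == "__main__":
    sr = 8000
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 220 * t)        # one second of a 220 Hz tone
    print(two_stage_features(tone).shape)     # (6, 16, 250)
```

Because each 1-D branch uses its own kernel length but the same hop, the branches share a common time axis while differing in spectral and temporal resolution, which is the property the abstract attributes to the first stage. In a trained model the kernels would be learned end-to-end rather than drawn at random.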

Paper Details

Authors:
Bo-Jun Li, Tai-Shih Chi
Submitted On:
9 May 2019 - 1:00pm
Short Link:
Type:
Poster
Event:
Presenter's Name:
Ming-Tso Chen
Paper Code:
AASP-P16.9
Document Year:
2019

Document Files

ICASSP2019_MINGTSO.pdf

[1] Bo-Jun Li, Tai-Shih Chi, "CNN Based Two-Stage Multi-Resolution End-to-End Model for Singing Melody Extraction", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4223. Accessed: Aug. 07, 2020.
@article{4223-19,
  url = {http://sigport.org/4223},
  author = {Bo-Jun Li and Tai-Shih Chi},
  publisher = {IEEE SigPort},
  title = {CNN Based Two-Stage Multi-Resolution End-to-End Model for Singing Melody Extraction},
  year = {2019}
}