Audio Analysis and Synthesis

REVISITING THE PROBLEM OF AUDIO-BASED HIT SONG PREDICTION USING CONVOLUTIONAL NEURAL NETWORKS


Being able to predict whether a song can be a hit has important applications in the music industry. Although the popularity of a song can be greatly affected by external factors such as social and commercial influences, the degree to which audio features computed from musical signals (which we regard as internal factors) can predict song popularity is an interesting research question in its own right.

Paper Details

Authors:
Li-Chia Yang, Szu-Yu Chou, Jen-Yu Liu, Yi-Hsuan Yang, Yi-An Chen
Submitted On:
3 March 2017 - 12:59am

Document Files

icassp2017.pdf

(82 downloads)

[1] Li-Chia Yang, Szu-Yu Chou, Jen-Yu Liu, Yi-Hsuan Yang, Yi-An Chen, "REVISITING THE PROBLEM OF AUDIO-BASED HIT SONG PREDICTION USING CONVOLUTIONAL NEURAL NETWORKS", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1600. Accessed: Jun. 29, 2017.

Global Variance in Speech Synthesis with Linear Dynamical Models

Paper Details

Authors:
Vassilis Tsiaras, Ranniery Maia, Vassilis Diakoloukas, Yannis Stylianou, Vassilis Digalakis
Submitted On:
11 March 2017 - 8:48pm

Document Files

GV_LDM.pdf

(25 downloads)

[1] Vassilis Tsiaras, Ranniery Maia, Vassilis Diakoloukas, Yannis Stylianou, Vassilis Digalakis, "Global Variance in Speech Synthesis with Linear Dynamical Models", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1597. Accessed: Jun. 29, 2017.

poster_STEGANALYSIS OF AAC USING CALIBRATED MARKOV MODEL OF ADJACENT CODEBOOK

Paper Details

Submitted On:
24 March 2016 - 10:48am

Document Files

poster_STEGANALYSIS OF AAC USINGCALIBRATED MARKOV MODEL OF ADJACENT CODEBOOK.pdf

(141 downloads)

[1] "poster_STEGANALYSIS OF AAC USING CALIBRATED MARKOV MODEL OF ADJACENT CODEBOOK", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1026. Accessed: Jun. 29, 2017.

poster_STEGANALYSIS OF AAC USING CALIBRATED MARKOV MODEL OF ADJACENT CODEBOOK

Paper Details

Submitted On:
24 March 2016 - 10:48am

Document Files

poster_STEGANALYSIS OF AAC USINGCALIBRATED MARKOV MODEL OF ADJACENT CODEBOOK.pdf

(132 downloads)

[1] "poster_STEGANALYSIS OF AAC USING CALIBRATED MARKOV MODEL OF ADJACENT CODEBOOK", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1025. Accessed: Jun. 29, 2017.

Lecture ICASSP 2016 Pierre Laffitte


This presentation introduces a deep learning model that classifies audio scenes in the subway environment. The final goal is to detect screams and shouts for surveillance purposes. The model combines a Deep Belief Network (DBN) and a Deep Neural Network (DNN): it is generatively pre-trained within the DBN framework and discriminatively fine-tuned within the DNN framework, and it is trained on a novel database of pseudo-real signals collected in the Paris metro.

Paper Details

Authors:
Pierre Laffitte, David Sodoyer, Laurent Girin, Charles Tatkeu
Submitted On:
23 March 2016 - 10:01am

Document Files

ICASSP Lecture.pdf

(152 downloads)

[1] Pierre Laffitte, David Sodoyer, Laurent Girin, Charles Tatkeu, "Lecture ICASSP 2016 Pierre Laffitte", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/991. Accessed: Jun. 29, 2017.

A Time Regularization Technique for Discrete Spectral Envelopes Through Frequency Derivative


Abstract: In most applications of sinusoidal models to speech signals, an amplitude spectral envelope is necessary. This envelope is not only assumed to fit the vocal tract filter response as accurately as possible, but should also vary slowly in shape across time. Indeed, time irregularities can generate artifacts in signal manipulations or improperly increase the variance of the features used in statistical models. In this letter, a simple technique is suggested to improve this time regularity.

Paper Details

Submitted On:
21 March 2016 - 11:37am

Document Files

ICASSPMFAPoster.pdf

(167 downloads)

[1] "A Time Regularization Technique for Discrete Spectral Envelopes Through Frequency Derivative", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/715. Accessed: Jun. 29, 2017.

Guided Signal Reconstruction with Application to Image Magnification


We propose signal reconstruction algorithms which utilize a guiding subspace that represents desired properties of reconstructed signals. Optimal reconstructed signals are shown to belong to a convex bounded set, called the "reconstruction" set. Iterative reconstruction algorithms, based on conjugate gradient methods, are developed to approximate optimal reconstructions with low memory and computational costs. Effectiveness of the proposed method is demonstrated with an application to image magnification.

Paper Details

Authors:
Akshay Gadde, Hassan Mansour, Dong Tian
Submitted On:
23 February 2016 - 1:44pm

Document Files

globalsip-15-slides-v2.pdf

(239 downloads)

[1] Akshay Gadde, Hassan Mansour, Dong Tian, "Guided Signal Reconstruction with Application to Image Magnification", IEEE SigPort, 2015. [Online]. Available: http://sigport.org/384. Accessed: Jun. 29, 2017.

Natural Sound Rendering for Headphones: Integration of signal processing techniques


With the strong growth of assistive and personal listening devices, natural sound rendering over headphones is becoming a necessity for prolonged listening in multimedia and virtual reality applications. The aim of natural sound rendering is to recreate sound scenes with spatial and timbral quality that is as natural as possible, so as to achieve a truly immersive listening experience. However, rendering natural sound over headphones faces many challenges. This tutorial article presents signal processing techniques that tackle these challenges to assist human listening.

Paper Details

Authors:
Kaushik Sunder, Ee-Leng Tan
Submitted On:
23 February 2016 - 1:44pm

Document Files

SPM2015manuscript-Natural Sound Rendering for Headphones.pdf

(344 downloads)

[1] Kaushik Sunder, Ee-Leng Tan, "Natural Sound Rendering for Headphones: Integration of signal processing techniques", IEEE SigPort, 2015. [Online]. Available: http://sigport.org/166. Accessed: Jun. 29, 2017.

Linear estimation based primary-ambient extraction for stereo audio signals (slides)


Audio signals for moving pictures and video games are often linear combinations of primary and ambient components. In spatial audio analysis-synthesis, these mixed signals are usually decomposed into primary and ambient components to facilitate flexible spatial rendering and enhancement. Existing approaches such as principal component analysis (PCA) and least squares (LS) are widely used to perform this decomposition from stereo signals.
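
As a rough illustration of the PCA baseline mentioned above (not the linear estimation framework of the listed paper), the sketch below projects a stereo signal onto its principal direction and treats the projection as the primary component and the residual as the ambient component. The function name `pca_primary_ambient`, the synthetic test signal, the panning factor `k`, and the noise level are all assumptions made for this example.

```python
import numpy as np


def pca_primary_ambient(x):
    """Split a stereo signal x of shape (2, n) into primary and ambient parts.

    Assumption for this sketch: the primary component is fully correlated
    across channels, so it lies along the dominant eigenvector of the
    2x2 inter-channel correlation matrix.
    """
    x = np.asarray(x, dtype=float)
    corr = x @ x.T / x.shape[1]          # 2x2 inter-channel correlation
    _, eigvecs = np.linalg.eigh(corr)    # eigenvalues returned in ascending order
    u = eigvecs[:, -1]                   # principal (largest-eigenvalue) direction
    primary = np.outer(u, u @ x)         # projection onto the principal direction
    ambient = x - primary                # residual is the ambient estimate
    return primary, ambient


# Synthetic stereo mix: one amplitude-panned source plus weak uncorrelated noise.
rng = np.random.default_rng(0)
n = 20000
k = 2.0                                  # hypothetical panning factor
source = rng.standard_normal(n)
noise = 0.1 * rng.standard_normal((2, n))
x = np.vstack([source, k * source]) + noise

primary, ambient = pca_primary_ambient(x)
```

By construction the two estimates sum back to the input and are mutually orthogonal. In a practical system the decomposition would typically be applied per time frame or per frequency band rather than to the whole signal at once.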

Paper Details

Authors:
Ee Leng Tan
Submitted On:
23 February 2016 - 1:43pm

Document Files

ASLP14slides_Linear estimation based primary-ambient extraction for stereo audio signals-short.pdf

(287 downloads)

[1] Ee Leng Tan, "Linear estimation based primary-ambient extraction for stereo audio signals (slides)", IEEE SigPort, 2015. [Online]. Available: http://sigport.org/159. Accessed: Jun. 29, 2017.

Linear estimation based primary-ambient extraction for stereo audio signals


Audio signals for moving pictures and video games are often linear combinations of primary and ambient components. In spatial audio analysis-synthesis, these mixed signals are usually decomposed into primary and ambient components to facilitate flexible spatial rendering and enhancement. Existing approaches such as principal component analysis (PCA) and least squares (LS) are widely used to perform this decomposition from stereo signals.

Paper Details

Authors:
Ee Leng Tan
Submitted On:
23 February 2016 - 1:43pm

Document Files

ASLP14manuscript_Linear estimation based primary-ambient extraction for stereo audio signals.pdf

(291 downloads)

[1] Ee Leng Tan, "Linear estimation based primary-ambient extraction for stereo audio signals", IEEE SigPort, 2015. [Online]. Available: http://sigport.org/158. Accessed: Jun. 29, 2017.