Audio Processing Systems

HIERARCHY-AWARE LOSS FUNCTION ON A TREE STRUCTURED LABEL SPACE FOR AUDIO EVENT DETECTION

icassp_arindam_hierarchy_ver2.pptx

aed_hal (560)

Categories: Audio Processing Systems

27 Views

Modeling nonlinear audio effects with end-to-end deep neural networks

Audio processors whose parameters are modified periodically
over time are often referred to as time-varying or modulation-based
audio effects. Most existing methods for modeling these types of
effect units are optimized for a very specific circuit and cannot
be efficiently generalized to other time-varying effects. Based on
convolutional and recurrent neural networks, we propose a deep
learning architecture for generic black-box modeling of audio processors
with long-term memory. We explore the capabilities of

ICASSP___Presentation_Martinez_Ramirez.pdf (595)

Categories: Music Signal Processing; Audio Processing Systems; Applications in Music and Audio Processing (MLR-MUSI)

53 Views

CNN Based Two-Stage Multi-Resolution End-to-End Model for Singing Melody Extraction

Inspired by human hearing perception, we propose a two-stage multi-resolution end-to-end model for singing melody extraction. A convolutional neural network (CNN) forms the core of the proposed model and generates the multi-resolution representations. 1-D and 2-D multi-resolution analyses of the waveform and of a spectrogram-like graph are carried out in succession, using 1-D and 2-D CNN kernels of different lengths and sizes.
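
As a rough illustration of the 1-D multi-resolution idea, the sketch below filters one waveform with kernels of several lengths and stacks the results as parallel channels. It is not the authors' model: in a CNN the kernels are learned, whereas here fixed Hann-shaped smoothing kernels stand in for them, and the function names and the kernel lengths (3, 9, 27) are assumptions chosen only for the example.

```python
import math


def hann_kernel(length):
    """Hann-shaped smoothing kernel, normalized to sum to 1.

    Stand-in for a learned CNN kernel (assumption for illustration).
    """
    weights = [
        0.5 - 0.5 * math.cos(2 * math.pi * (i + 1) / (length + 1))
        for i in range(length)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def conv1d_same(signal, kernel):
    """Slide the kernel over the zero-padded signal (cross-correlation,
    as in CNN layers) so the output has the same length as the input."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * (len(kernel) - 1 - half)
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def multi_resolution(signal, kernel_lengths=(3, 9, 27)):
    """Return one filtered copy of the signal per kernel length.

    Short kernels keep fine temporal detail; long kernels summarize
    a wider context. The lengths are hypothetical example values.
    """
    return [conv1d_same(signal, hann_kernel(n)) for n in kernel_lengths]


# Usage: a unit impulse in the middle of a 64-sample signal.
signal = [0.0] * 64
signal[32] = 1.0
features = multi_resolution(signal)
```

Each entry of `features` is one resolution of the same input. A 2-D analogue would apply kernels of different sizes to a time-frequency representation in the same way.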

ICASSP2019_MINGTSO.pdf (469)

Categories: Audio Processing Systems; Music Signal Processing

32 Views

Contextual Speech Recognition with Difficult Negative Training Examples

poster.pdf (565)

Categories: Audio Processing Systems

22 Views

Exploring CTC-network derived features with conventional hybrid system

icassp2018.pdf (809)

Categories: Audio Processing Systems

118 Views

Learning Environmental Sounds with End-to-end Convolutional Neural Network

Environmental sound classification (ESC) is usually conducted based on handcrafted features such as the log-mel feature. Meanwhile, end-to-end classification systems perform feature extraction jointly with classification and have achieved success particularly in image classification. In the same manner, if environmental sounds could be directly learned from the raw waveforms, we would be able to extract a new feature effective for classification that could not have been designed by humans, and this new feature could improve the classification performance.

poster1.pdf

Poster (1165)

Categories: Audio Processing Systems

49 Views

Cluster-Based Senone Selection for the Efficient Calculation of Deep Neural Network Acoustic Models

This is an oral presentation at ISCSLP. For more information, please refer to the paper:

Jun-Hua Liu, Zhen-Hua Ling, Si Wei, Guo-Ping Hu, Li-Rong Dai, "Cluster-Based Senone Selection for the Efficient Calculation of Deep Neural Network Acoustic Models", ISCSLP, 2016.

20161001_dnn_cluster_v2.pptx (789)

Categories: Audio Processing Systems

13 Views

Acoustic detection and localization of impulsive events in urban environments

SPM Student submission_Tahir.zip (77)

Categories: Audio Processing Systems; DSP algorithm implementation in hardware and software

23 Views

LEARNING COMPACT STRUCTURAL REPRESENTATIONS FOR AUDIO EVENTS USING REGRESSOR BANKS

We introduce a new learned descriptor for audio signals that is efficient for event representation. The entries of the descriptor are produced by evaluating a set of regressors on the input signal. The regressors are class-specific and are trained using the random regression forests framework. Given an input signal, each regressor estimates the onset and offset positions of the target event. The confidence scores of these estimates are then used to quantify how well the target event aligns with the temporal structure of the corresponding category.