Deep Neural Networks (DNNs) are a basic method for rare Acoustic Event Detection (AED) in synthesised audio. DNN structures used for AED tasks, including the Multi-Layer Perceptron (MLP) and the Recurrent Neural Network (RNN), have rather fewer hidden layers than those used in computer vision systems. This paper tries to demonstrate that a DNN with more hidden layers does not necessarily guarantee better performance in AED tasks.

Motivated by applications such as ordinal embedding and collaborative ranking, we formulate homogeneous quadratic feasibility as an unconstrained, non-convex minimization problem. Our work aims to understand the landscape (local minimizers and global minimizers) of the non-convex objective, which corresponds to hinge losses arising from quadratic constraints. Under certain assumptions, we give necessary conditions for non-global, local minimizers of our objective and additionally show that in two dimensions, every local minimizer is a global minimizer.
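
As a rough illustration of the kind of objective described above, the sketch below evaluates a sum of hinge losses over homogeneous quadratic constraints. The constraint form `x^T P_i x >= margin`, the function name `hinge_objective`, and the `margin` parameter are assumptions made for this example; the abstract does not state the exact formulation.

```python
import numpy as np

def hinge_objective(x, mats, margin=1.0):
    """Sum of hinge losses max(0, margin - x^T P_i x).

    x    : vector of shape (d,)
    mats : stack of constraint matrices P_i, shape (m, d, d)
    The constraint form and the margin are assumptions for illustration.
    """
    # Quadratic form x^T P_i x for every constraint matrix at once.
    quad = np.einsum("i,kij,j->k", x, mats, x)
    # A constraint contributes only when it is violated by the margin.
    return float(np.sum(np.maximum(0.0, margin - quad)))

# Two hypothetical constraints in two dimensions.
mats = np.stack([np.eye(2), np.diag([1.0, -1.0])])
feasible = hinge_objective(np.array([2.0, 0.0]), mats)
infeasible = hinge_objective(np.array([0.0, 2.0]), mats)
```

Under these assumptions, a point that satisfies every constraint with the margin attains the global minimum value of zero, while the objective is non-convex in `x` because the matrices may be indefinite.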

Performance of object classification using 3D automotive radar relies on accurate data association and multitarget tracking, which are greatly affected by data bias and the proximity of objects to each other. A regularized fuzzy c-means (RFCM) algorithm is proposed herein to resolve the data association uncertainty problem, and it is shown to outperform the conventional FCM algorithm. The proposed method exploits results from the companion tracker to increase performance robustness. Results on simulated and field data demonstrate the efficacy of the proposed method.
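
For context, the conventional FCM baseline mentioned above can be sketched as follows. This is the standard (unregularized) fuzzy c-means iteration, not the proposed RFCM; the function name, the fuzzifier `m=2.0`, the fixed iteration count, and the toy data are assumptions chosen for illustration.

```python
import numpy as np

def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, seed=0):
    """Standard fuzzy c-means; returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    u = rng.random((len(points), n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Centers are membership-weighted means of the points.
        w = u ** m
        centers = (w.T @ points) / w.sum(axis=0)[:, None]
        # Distance from every point to every center.
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
    return centers, u

# Toy detections forming two well-separated groups (made-up values).
points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
centers, u = fuzzy_c_means(points, n_clusters=2)
labels = u.argmax(axis=1)
```

Each detection receives a soft membership in every cluster, which is what makes FCM a natural starting point when the association between detections and objects is uncertain.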

Dynamic functional connectivity has become a prominent approach for tracking changes in macroscale statistical dependencies between regions of the brain. Effective parametrization of these statistical dependencies, referred to as brain states, is, however, still an open problem. We investigate different emission models in the hidden Markov model framework, each representing certain assumptions about dynamic changes in the brain.

In this paper, we propose a deep-learning-based algorithm to estimate the position of a user by utilizing reference signal received power (RSRP) and the locations of base stations. To obtain reliable results in a real communication environment, parameters were measured using commercially available base stations and mobile phones within an LTE network. Since the structure of the measured data changes with the number of connected base stations, data uniformity processing is necessary before running the deep learning network.
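
The abstract does not say how the data are made uniform. One common option, shown here purely as an assumed sketch, is to keep the strongest base stations, pad missing slots with a placeholder RSRP, and append a presence flag. The function name, the slot count, and the fill value of -140 dBm are hypothetical choices, not the authors' method.

```python
import numpy as np

def to_fixed_length(measurements, max_stations=4, fill_rsrp=-140.0):
    """Turn a variable-length list of (x, y, rsrp_dbm) tuples into a
    fixed-length feature vector of size 4 * max_stations.

    Each slot holds (x, y, rsrp_dbm, present_flag). Empty slots use
    fill_rsrp and a flag of 0. All defaults are illustrative assumptions.
    """
    # Keep the strongest stations first so slot order is consistent.
    ordered = sorted(measurements, key=lambda m: m[2], reverse=True)
    ordered = ordered[:max_stations]
    features = np.zeros((max_stations, 4))
    features[:, 2] = fill_rsrp  # placeholder for missing stations
    for row, (x, y, rsrp) in enumerate(ordered):
        features[row] = (x, y, rsrp, 1.0)
    return features.ravel()

# A sample with a single connected base station, padded to two slots.
vec = to_fixed_length([(1.0, 2.0, -90.0)], max_stations=2)
```

With a representation like this, every sample has the same shape regardless of how many base stations were connected, so it can be fed to a network with a fixed input size.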

Traditional NMF-based signal decomposition relies on the factorization of spectral data, which is typically computed by means of a short-time frequency transform. In this paper, we propose to relax the choice of a pre-fixed transform and learn a short-time orthogonal transform together with the factorization. To this end, we formulate a regularized optimization problem reminiscent of conventional NMF, yet with the transform as an additional unknown, and design a novel block-descent algorithm that finds stationary points of this objective function.

We present a new method to generate fake data in unknown classes within the generative adversarial network (GAN) framework. The generator is trained to produce data that are similar to, yet different from, data in the known classes by modelling a noisy distribution on the feature space of a classifier using the proposed marginal denoising autoencoder. The generated data are treated as fake instances of unknown classes and given to the classifier to make it robust to real unknown classes.

Speech emotion recognition is important for understanding users' intentions in human-computer interaction. However, it is a challenging task, partly because it is unclear which features and models are effective for distinguishing emotions. Previous studies apply a convolutional neural network (CNN) directly to spectrograms to extract features, and bidirectional long short-term memory (BLSTM) is the state-of-the-art model. However, there are two problems with CNN-BLSTM. Firstly, it does not utilize heuristic features based on prior knowledge.

We propose a benchmark curve that measures the inherent complexity of a detection problem. The benchmark curve is built using a sequence of simple detection methods based upon random projection. It is parameterized by the area above the receiver-operating characteristic curve of the detection method and its computational cost. It divides the plane into regions that can be used to characterize the computational and structural advantages of a given detection method. Numerical illustrations are provided.
