- Deep CNN Sparse Coding Analysis
Deep Convolutional Sparse Coding (D-CSC) is a framework reminiscent
of deep convolutional neural networks (DCNNs). Because it omits the learning
of the dictionaries, one can analyse more transparently the role of the
activation function and its ability to recover activation paths
through the layers. Papyan, Romano, and Elad conducted an analysis of
such an architecture \cite{2016arXiv160708194P}, demonstrated the
relationship with DCNNs, and proved conditions under which a D-CSC is
guaranteed to recover activation paths. A technical innovation of
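
To make the role of the activation function concrete, the following is a minimal sketch of a layered thresholding pass with fixed (not learned) dictionaries. It is an illustration under stated assumptions, not the analysed architecture itself: dense matrices stand in for convolutional dictionaries, and the function names, the toy dictionary, and the threshold values are hypothetical. It uses the fact that a ReLU with a non-negative bias `b` coincides with the non-negative part of soft thresholding.

```python
def soft_threshold(x, b):
    """Two-sided soft thresholding: shrink x toward zero by b (b >= 0)."""
    if x > b:
        return x - b
    if x < -b:
        return x + b
    return 0.0


def nonneg_soft_threshold(x, b):
    """Keep only the positive part of soft thresholding.

    For b >= 0 this equals ReLU(x - b) = max(x - b, 0).
    """
    return max(soft_threshold(x, b), 0.0)


def matvec_transpose(D, x):
    """Return D^T x, where D is given as a list of rows."""
    return [sum(D[i][j] * x[i] for i in range(len(x)))
            for j in range(len(D[0]))]


def layered_thresholding(x, dictionaries, thresholds):
    """Apply D^T followed by non-negative thresholding, layer by layer.

    Returns the code (activation vector) of every layer.
    """
    codes = []
    for D, b in zip(dictionaries, thresholds):
        x = [nonneg_soft_threshold(v, b) for v in matvec_transpose(D, x)]
        codes.append(x)
    return codes


# Toy example (hypothetical values): one layer with an identity dictionary.
identity = [[1.0, 0.0], [0.0, 1.0]]
codes = layered_thresholding([1.0, 0.2], [identity], [0.5])
support = [j for j, v in enumerate(codes[-1]) if v != 0.0]
```

In this toy case only the first atom survives the threshold, so the recovered support (the "activation path" of this single layer) is the first index.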
- MOTIFNET: A MOTIF-BASED GRAPH CONVOLUTIONAL NETWORK FOR DIRECTED GRAPHS
Deep learning on graphs, and in particular graph convolutional neural networks, has recently attracted significant attention in the machine learning community. Many of such
- INVESTIGATING LABEL NOISE SENSITIVITY OF CONVOLUTIONAL NEURAL NETWORKS FOR FINE GRAINED AUDIO SIGNAL LABELLING
We measure the effect of small amounts of systematic and
random label noise caused by slightly misaligned ground truth
labels in a fine-grained audio signal labeling task. The task
we choose to demonstrate these effects on is also known as
framewise polyphonic transcription or note-quantized multi-f0
estimation, and transforms a monaural audio signal into a
sequence of note indicator labels. We show that even
slight misalignments have clearly apparent effects, demonstrating a great sensitivity of convolutional neural networks
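
As a rough illustration of how a slight misalignment turns into label noise, the sketch below shifts a framewise indicator sequence for a single note by a few frames and counts the fraction of frames whose label changes. The function names and the toy label sequence are hypothetical and are not taken from the paper's experiments.

```python
def shift_labels(labels, offset):
    """Shift a framewise 0/1 label sequence by `offset` frames.

    Positive offsets delay the labels, negative offsets advance them.
    Frames that fall outside the original range are padded with 0.
    """
    n = len(labels)
    if offset >= 0:
        return [0] * min(offset, n) + labels[: max(n - offset, 0)]
    return labels[-offset:] + [0] * min(-offset, n)


def label_noise_rate(clean, noisy):
    """Fraction of frames whose label differs between the two sequences."""
    return sum(c != d for c, d in zip(clean, noisy)) / len(clean)


# Toy indicator sequence for one note over ten frames (hypothetical data).
clean = [0, 1, 1, 1, 0, 0, 1, 1, 0, 0]
delayed = shift_labels(clean, 1)
rate = label_noise_rate(clean, delayed)
```

With short notes, every onset and offset contributes a flipped frame, so even a one-frame shift changes a substantial share of the labels in this toy sequence.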
- VR IQA NET: Deep Virtual Reality Image Quality Assessment using Adversarial Learning
In this paper, we propose a novel virtual reality image quality assessment (VR IQA) method with adversarial learning for omnidirectional images. To take into account the characteristics of the omnidirectional image, we devise deep networks including a novel quality score predictor and a human perception guider. The proposed quality score predictor automatically predicts the quality score of a distorted image using latent spatial and position features.
- TasNet: time-domain audio separation network for real-time, single-channel speech separation
Robust speech processing in multi-talker environments requires effective speech separation. Recent deep learning systems have made significant progress toward solving this problem, yet it remains challenging, particularly in real-time, short-latency applications. Most methods attempt to construct a mask for each source in a time-frequency representation of the mixture signal, which is not necessarily an optimal representation for speech separation.
- Generative Adversarial Network and its Applications to Speech Signal and Natural Language Processing
A generative adversarial network (GAN) is a new approach to training models, in which a generator and a discriminator compete against each other to improve the generation quality. Recently, GANs have shown amazing results in image generation, and a wide variety of new ideas, techniques, and applications have been developed based on them. Although there are only a few successful cases so far, GANs have great potential to be applied to text and speech generation to overcome limitations of conventional methods.
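
For reference, the competition between generator and discriminator is commonly written as the minimax objective below. This is the standard formulation from the original GAN literature, given here as an assumption about what is meant, since the abstract does not state an objective. Here $G$ is the generator, $D$ the discriminator, $p_{\text{data}}$ the data distribution, and $p_z$ the noise prior.

```latex
\min_{G}\,\max_{D}\;
  \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```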
- MODEL BASED DEEP LEARNING IN FREE BREATHING, UNGATED, CARDIAC MRI RECOVERY
We introduce a model-based reconstruction
framework with deep-learned (DL) and smoothness regularization
on manifolds (STORM) priors to recover free-breathing
and ungated (FBU) cardiac MRI from highly undersampled
measurements. The DL priors enable us to exploit
the local correlations, while the STORM prior enables
us to make use of the extensive non-local similarities that are
subject dependent. We introduce a novel model-based formulation
that allows the seamless integration of deep learning
- AN UPPER BOUND ON THE REQUIRED SIZE OF A NEURAL NETWORK CLASSIFIER
There is growing interest in understanding the impact of architectural parameters such as depth, width, and the type of
activation function on the performance of a neural network. We provide an upper bound on the number of free parameters
a ReLU-type neural network needs to exactly fit the training data. Whether a net of this size generalizes to test data will
be governed by the fidelity of the training data and the applicability of the principle of Occam’s Razor. We introduce the
In this paper, we propose a new loss function called generalized end-to-end (GE2E) loss, which makes the training of speaker verification models more efficient than our previous tuple-based end-to-end (TE2E) loss function. Unlike TE2E, the GE2E loss function updates the network in a way that emphasizes examples that are difficult to verify at each step of the training process. Additionally, the GE2E loss does not require an initial stage of example selection.
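
The sketch below illustrates one way such a loss can be computed, following the softmax variant as it is usually described: each utterance embedding is compared by scaled cosine similarity with every speaker centroid, and the utterance's own embedding is left out of its own speaker's centroid. It is a hedged, pure-Python illustration, not the authors' implementation. The function names are hypothetical, `w` and `b` are treated as fixed constants here although they would normally be learned, and every speaker is assumed to have at least two non-zero embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Element-wise mean of a list of equally sized vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def ge2e_softmax_loss(embeddings, w=10.0, b=-5.0):
    """Sum of softmax losses over all utterances.

    embeddings[j][i] is the embedding of utterance i of speaker j.
    w and b are assumed fixed scale and offset values for this sketch.
    """
    centroids = [centroid(utterances) for utterances in embeddings]
    total = 0.0
    for j, utterances in enumerate(embeddings):
        for i, e in enumerate(utterances):
            sims = []
            for k, c in enumerate(centroids):
                if k == j:
                    # Exclude the utterance itself from its own centroid.
                    c = centroid(utterances[:i] + utterances[i + 1:])
                sims.append(w * cosine(e, c) + b)
            # Numerically stable log-sum-exp over all speakers.
            m = max(sims)
            log_sum = m + math.log(sum(math.exp(s - m) for s in sims))
            total += log_sum - sims[j]
    return total


# Toy two-dimensional embeddings (hypothetical values).
separated = [[[1.0, 0.0], [1.0, 0.1]], [[0.0, 1.0], [0.1, 1.0]]]
mixed = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.1], [0.1, 1.0]]]
loss_separated = ge2e_softmax_loss(separated)
loss_mixed = ge2e_softmax_loss(mixed)
```

Because the loss for each utterance depends on its similarity to all centroids at once, utterances that sit close to the wrong centroid dominate the total, which matches the emphasis on hard examples described above.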
- A Random Matrix and Concentration Inequalities framework for Neural Networks Analysis
Our article provides a theoretical analysis of the asymptotic performance of a regression or classification task performed by a simple random neural network. This result is obtained by leveraging a new framework at the crossroads between random matrix theory and the concentration of measure theory. This approach is of utmost interest for neural network analysis at large in that it naturally dismisses the difficulty induced by the non-linear activation functions, so long as these are Lipschitz functions.