
Common problems in time series classification, such as insufficient training instances, demand novel deep neural architectures that deliver consistent and accurate performance. A deep residual network (ResNet) learns through H(x) = F(x) + x, where F(x) is a nonlinear function. We propose Blend-Res2Net, which blends two different representation spaces, H^1(x) = F(x) + Trans(x) and H^2(x) = F(Trans(x)) + x, where Trans(∙) denotes the transformation function. The intention is to learn over a richer representation by capturing the temporal as well as the spectral signatures.
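The two blended mappings can be illustrated with a minimal sketch. The abstract does not specify Trans(∙) or F(∙), so both are assumptions here: `trans` is a hypothetical length-preserving DFT magnitude (chosen only because the abstract mentions spectral signatures), and `f` is a toy element-wise nonlinearity standing in for the learned residual branch. How the two outputs are combined downstream is not stated in the abstract and is left out.

```python
import cmath
import math


def trans(x):
    """Hypothetical Trans(.): normalized DFT magnitude, same length as x (an assumption)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) / n
        for k in range(n)
    ]


def f(x, weight=0.5):
    """Toy stand-in for the learned nonlinear residual branch F(.) (an assumption)."""
    return [math.tanh(weight * v) for v in x]


def plain_residual(x):
    """Standard ResNet mapping: H(x) = F(x) + x."""
    return [fx + xi for fx, xi in zip(f(x), x)]


def blend_h1(x):
    """H^1(x) = F(x) + Trans(x): the skip path carries the transformed input."""
    return [fx + tx for fx, tx in zip(f(x), trans(x))]


def blend_h2(x):
    """H^2(x) = F(Trans(x)) + x: the residual branch sees the transformed input."""
    return [ftx + xi for ftx, xi in zip(f(trans(x)), x)]


# A short toy series (one period of a sampled sine wave).
series = [0.0, 1.0, 0.0, -1.0]
h = plain_residual(series)
h1 = blend_h1(series)
h2 = blend_h2(series)
```

In this sketch, `h1` keeps the temporal view in the nonlinear branch and the spectral view in the skip connection, while `h2` swaps the two roles.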


In this paper, we propose a residual echo suppression method using a UNet neural network that directly maps the outputs of a linear acoustic echo canceler to the desired signal in the spectral domain. This system embeds a design parameter that allows a tunable tradeoff between the desired-signal distortion and residual echo suppression in double-talk scenarios. The system employs 136 thousand parameters and requires 1.6 giga floating-point operations per second and 10 megabytes of memory.


Images captured nowadays come in varying dimensions, with smartphones and DSLRs allowing users to choose from a list of available image resolutions. It is therefore imperative for forensic algorithms such as resampling detection to scale well to images of varying dimensions. However, in our experiments we observed that many state-of-the-art forensic algorithms are sensitive to image size, and their performance quickly degrades when they are applied to images of diverse dimensions, even after re-training them on multiple image sizes.


We propose a novel neural network-based adaptive image denoiser, dubbed Neural AIDE. Unlike other neural network-based denoisers, which typically apply supervised training to learn a mapping from a noisy patch to a clean patch, we train a neural network to learn context-based affine mappings that are applied to each noisy pixel. Our formulation enables using SURE (Stein’s Unbiased Risk Estimator)-like estimated losses of those mappings as empirical risks to minimize.
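To make the idea of an estimated loss for an affine mapping concrete, the following sketch checks one such estimator numerically. It rests on assumptions not stated in the abstract: the noise is additive, zero-mean, and of known variance sigma^2, and the affine coefficients (a, b) depend only on the surrounding context, not on the noisy pixel z itself. Under those assumptions, (z - (a z + b))^2 + (2a - 1) sigma^2 is an unbiased estimate of the true squared error (x - (a z + b))^2, so it can be computed without the clean pixel x. The function names are illustrative, and the exact loss used in the paper may differ.

```python
import random


def affine_denoise(z, a, b):
    """Apply an affine mapping to a noisy pixel z."""
    return a * z + b


def estimated_loss(z, a, b, sigma):
    """Estimate of (x - (a*z + b))**2 that needs no clean pixel x.

    Unbiased when the noise is additive, zero-mean with variance sigma**2,
    and (a, b) do not depend on z (assumptions of this sketch).
    """
    return (z - affine_denoise(z, a, b)) ** 2 + (2.0 * a - 1.0) * sigma ** 2


def true_loss(x, z, a, b):
    """Squared error against the clean pixel x (unavailable in practice)."""
    return (x - affine_denoise(z, a, b)) ** 2


# Monte Carlo check with made-up values: both averages should nearly agree.
rng = random.Random(0)
x, sigma, a, b = 0.6, 0.1, 0.8, 0.1
n = 100_000
est_mean = 0.0
true_mean = 0.0
for _ in range(n):
    z = x + rng.gauss(0.0, sigma)  # simulated noisy observation
    est_mean += estimated_loss(z, a, b, sigma) / n
    true_mean += true_loss(x, z, a, b) / n
```

Because the estimate depends only on the noisy pixel, the mapping coefficients, and the noise level, it can serve as a training objective on noisy data alone, which is the property the abstract attributes to SURE-like losses.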


We investigate the impact of objective functions on the performance of deep-learning-based prostate magnetic resonance image segmentation. To this end, we first develop a baseline convolutional neural network (BCNN) for prostate image segmentation, which consists of encoding, bridge, decoding, and classification modules. In the BCNN, we use 3D convolutional layers to take volumetric information into account. We also adopt residual feature forwarding and intermediate feature propagation techniques to make the BCNN reliably trainable with various objective functions.


Speechreading is a notoriously difficult task for humans to perform. In this paper we present an end-to-end model based on a convolutional neural network (CNN) for generating an intelligible acoustic speech signal from silent video frames of a speaking person. The proposed CNN generates sound features for each frame based on its neighboring frames. Waveforms are then synthesized from the learned speech features to produce intelligible speech.
