In this paper we propose a Deep Autoencoder Mixture Clustering (DAMIC) algorithm based on a mixture of deep autoencoders where each cluster is represented by an autoencoder. A clustering network transforms the data into another space and then selects one of the clusters. Next, the autoencoder associated with this cluster is used to reconstruct the data-point. The clustering algorithm jointly learns the nonlinear data representation and the set of autoencoders. The optimal clustering is found by minimizing the reconstruction loss of the mixture of autoencoder network.
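
The forward pass described above (select a cluster, then reconstruct with that cluster's autoencoder) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the rank-1 linear autoencoders, the linear-plus-softmax clustering network, the hard argmax selection, and all names (`linear_autoencoder`, `clustering_network`, `mixture_forward`) are assumptions made for the example, and no training step is shown.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear_autoencoder(direction):
    """Hypothetical rank-1 autoencoder: encode to a scalar, decode back."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    def reconstruct(x):
        code = sum(u * xi for u, xi in zip(unit, x))  # encoder
        return [code * u for u in unit]               # decoder
    return reconstruct

def clustering_network(x, weights):
    """Stand-in for the clustering network: linear scores, then softmax."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(scores)

def squared_error(x, x_hat):
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def mixture_forward(x, weights, autoencoders):
    """Pick the most probable cluster and reconstruct x with its autoencoder."""
    probs = clustering_network(x, weights)
    cluster = max(range(len(probs)), key=probs.__getitem__)
    x_hat = autoencoders[cluster](x)
    return cluster, squared_error(x, x_hat)

# Two toy clusters: points near the x-axis and points near the y-axis.
autoencoders = [linear_autoencoder([1.0, 0.0]), linear_autoencoder([0.0, 1.0])]
weights = [[1.0, 0.0], [0.0, 1.0]]
points = [[2.0, 0.1], [0.2, 3.0]]

results = [mixture_forward(p, weights, autoencoders) for p in points]
total_loss = sum(loss for _, loss in results)
```

In this toy setup each point is assigned to the autoencoder that reconstructs it well, so the summed reconstruction loss stays small. A trained version would adjust both the clustering weights and the autoencoder parameters to minimize that loss.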


We present a novel variational generative adversarial network (VGAN) based on Wasserstein loss to learn a latent representation
from a face image that is invariant to identity but preserves head-pose information. This facilitates synthesis of a realistic face
image with the same head pose as a given input image, but with a different identity. One application of this network is in
privacy-sensitive scenarios; after identity replacement in an image, utility, such as head pose, can still


An omnidirectional image (ODI) enables viewers to look in every direction from a fixed point through a head-mounted display, providing a more immersive experience than a standard image. Designing immersive virtual reality systems with ODIs is challenging as they require high-resolution content. In this paper, we study super-resolution for ODIs and propose an improved model based on generative adversarial networks, which is optimized to handle the artifacts obtained in the spherical observational space.


The growing use of virtual autonomous agents in applications like games and entertainment demands better control policies for natural-looking movements and actions. Unlike the conventional approach of hard-coding motion routines, we propose a deep learning method for obtaining control policies by directly mimicking raw video demonstrations. Previous methods in this domain rely on extracting low-dimensional features from expert videos followed by a separate hand-crafted reward estimation step.


A novel single-image rain removal method is proposed based on multi-scale cascading image generation (MSCG). In particular, the proposed method consists of an encoder extracting multi-scale features from images and a decoder generating de-rained images with a cascading mechanism. The encoder combines convolutional neural networks that use kernels of different sizes and integrates their outputs across different scales.


Convolutional neural networks (CNNs) have shown state-of-the-art results for low-level computer vision problems such as stereo and monocular disparity estimation, but there is still considerable room to improve their performance in terms of accuracy, number of parameters, etc. Recent works have uncovered the advantages of using an unsupervised scheme to train CNNs to estimate monocular disparity, where only the relatively easy-to-obtain stereo images are needed for training.


Portrait segmentation is becoming a hot topic nowadays.
In this paper we propose a novel framework to meet
the high precision requirements that portrait segmentation
demands in boundary areas by deep refinement of the
portrait matting. Our approach introduces three novel
techniques. First, a trimap is proposed by fusing information
coming from two well-known techniques for image
segmentation, i.e., Mask R-CNN and DensePose. Second,
an alpha matting algorithm runs over the previous trimap


Even though zero padding is a staple in convolutional
neural networks to maintain the output size, it is questionable
because it significantly alters the input distribution
around the border region. To mitigate this problem, in this paper,
we propose a new padding technique termed distribution
padding. The goal of the method is to approximately maintain
the statistics of the input border regions. We introduce
two different ways to achieve our goal. In both approaches,
the padded values are derived from the means of the border


In dynamic state-space models, the state can be estimated through recursive computation of the posterior distribution of the state given all measurements. In scenarios where active sensing/querying is possible, a hard decision is made when the state posterior achieves a pre-set confidence threshold. This mandate to meet a hard threshold may sometimes unnecessarily require more queries. In application domains where sensing/querying cost is of concern, some potential accuracy may be sacrificed for greater gains in sensing cost.
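
The recursion and the hard-threshold stopping rule described above can be illustrated with a small discrete-state filter. This is only a sketch under stated assumptions: the two-state model, the transition matrix, the fixed measurement likelihoods, and the function names (`predict`, `update`, `filter_until_confident`) are hypothetical and chosen for the example; they are not taken from the paper.

```python
def predict(posterior, transition):
    """Propagate the belief one step through the state dynamics."""
    n = len(posterior)
    return [sum(posterior[i] * transition[i][j] for i in range(n))
            for j in range(n)]

def update(prior, likelihood):
    """Bayes update: weight the predicted belief by the measurement likelihood."""
    unnorm = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def filter_until_confident(prior, transition, likelihoods, threshold):
    """Query until the largest posterior mass reaches the threshold.

    Returns (decided_state or None, number_of_queries_used, final_posterior).
    """
    posterior = list(prior)
    for queries, likelihood in enumerate(likelihoods, start=1):
        posterior = update(predict(posterior, transition), likelihood)
        if max(posterior) >= threshold:
            return posterior.index(max(posterior)), queries, posterior
    return None, len(likelihoods), posterior

# Assumed toy model: two states that mostly persist between steps, and
# five identical measurements that each favor state 0.
transition = [[0.9, 0.1], [0.1, 0.9]]
prior = [0.5, 0.5]
likelihoods = [[0.8, 0.2]] * 5

strict = filter_until_confident(prior, transition, likelihoods, 0.95)
relaxed = filter_until_confident(prior, transition, likelihoods, 0.90)
```

With these assumed numbers, the stricter threshold needs one more query than the relaxed one to reach the same decision, which is the kind of cost-versus-confidence trade-off the abstract refers to.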