
Generative adversarial networks (GANs) synthesize realistic images from random latent vectors. While many studies have explored training configurations and architectures for GANs, the inverse problem of recovering the latent vector that generates a given image has received comparatively little attention. We train a ResNet architecture to map a given face to a latent vector from which the generator reproduces a face nearly identical to the target. We combine a perceptual loss, which embeds facial detail in the recovered latent vector, with a pixel loss that preserves visual quality.
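The combined objective above can be sketched as follows. This is a minimal, framework-free illustration: `features` stands in for a pretrained feature extractor (e.g. a VGG backbone), and the weighting `lam` is a hypothetical hyperparameter, not a value from the abstract.

```python
def pixel_loss(x, y):
    # Mean squared error in pixel space: maintains visual quality.
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def perceptual_loss(fx, fy):
    # Mean squared error between feature activations of a pretrained
    # network: embeds semantic face detail in the recovered latent.
    return sum((a - b) ** 2 for a, b in zip(fx, fy)) / len(fx)

def inversion_loss(x, y, features, lam=0.5):
    # Weighted sum of the two terms; lam trades detail against fidelity.
    return pixel_loss(x, y) + lam * perceptual_loss(features(x), features(y))
```

In practice `x` would be the generator's output for the predicted latent vector and `y` the target face, with gradients flowing back through the ResNet encoder.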


We present a novel video stabilization algorithm (LSstab) that removes unwanted camera motion in real time. LSstab is based on a novel least squares formulation of the smoothing cost function that alleviates undesirable camera jitter. A recursive least squares solver is derived to minimize the smoothing cost function with O(N) computational complexity. LSstab is evaluated on a suite of publicly available videos against state-of-the-art video stabilization methods. Results show that LSstab achieves comparable or better performance, reaching real-time processing speed when a GPU is used.
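A common least squares smoothing cost for a camera path penalizes both deviation from the measured path and frame-to-frame change. The sketch below is an assumption on my part, not the paper's exact solver: it greedily minimizes the per-frame cost (p_t − c_t)² + λ(p_t − p_{t−1})², whose closed-form minimizer is a weighted average, giving an O(N) causal pass.

```python
def smooth_path(camera_path, lam=10.0):
    # Recursive O(N) smoothing of a 1-D camera trajectory.
    # Each step solves argmin_p (p - c_t)^2 + lam * (p - p_prev)^2
    # in closed form: p = (c_t + lam * p_prev) / (1 + lam).
    smoothed = [camera_path[0]]
    for c in camera_path[1:]:
        p = (c + lam * smoothed[-1]) / (1.0 + lam)
        smoothed.append(p)
    return smoothed
```

Larger `lam` yields a smoother (but more laggy) path; the stabilized frame is then warped by the difference between the original and smoothed trajectories.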


Attention mechanisms, which enable a neural network to focus on the relevant elements of its input, have become an essential component for improving the performance of deep neural networks. Two attention mechanisms are widely used in computer vision: spatial attention and channel attention, which capture pixel-level pairwise relationships and inter-channel dependencies, respectively. Although fusing the two can achieve better performance than either alone, doing so inevitably increases the computational overhead.
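Channel attention can be illustrated with a simplified squeeze-and-excitation-style gate. This sketch is an assumption for illustration: it keeps the global-average-pool "squeeze" and a sigmoid gate but omits the learned fully connected bottleneck used in real implementations.

```python
import math

def channel_attention(feature_maps):
    # feature_maps: list of channels, each a 2-D grid (H x W) of floats.
    # Squeeze: global average pooling per channel.
    weights = []
    for ch in feature_maps:
        flat = [v for row in ch for v in row]
        weights.append(sum(flat) / len(flat))
    # Excitation: sigmoid gate in (0, 1) per channel
    # (a learned bottleneck would normally sit between squeeze and gate).
    gates = [1.0 / (1.0 + math.exp(-w)) for w in weights]
    # Re-weight each channel by its gate.
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(feature_maps, gates)]
```

Spatial attention is the complementary operation: it pools across channels to produce one weight per pixel instead of one per channel.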


Prior work on image compression has focused on optimizing models to achieve better reconstruction at lower bit rates. These approaches concentrate on sophisticated architectures that enhance encoder or decoder performance; in some cases, both are jointly optimized together with a learned form of entropy coding. In some instances, these approaches introduce many redundant components of questionable utility.
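Joint optimization of encoder, decoder, and entropy coding is typically driven by a rate-distortion objective. The sketch below is a generic illustration of that objective, not this paper's method: rate is estimated as the Shannon entropy of the quantized latent symbols, and distortion as reconstruction MSE.

```python
import math
from collections import Counter

def empirical_bits(symbols):
    # Shannon entropy of quantized latent symbols: estimated bits per symbol.
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def rate_distortion_loss(symbols, recon, target, lam=0.1):
    # Joint objective R + lam * D minimized end to end:
    # lam trades bit rate against reconstruction quality.
    mse = sum((a - b) ** 2 for a, b in zip(recon, target)) / len(target)
    return empirical_bits(symbols) + lam * mse
```

In learned codecs the entropy term is usually replaced by a differentiable probability model so the whole pipeline can be trained with gradient descent.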


In this paper, we concentrate on the super-resolution (SR) of compressed screen content video, addressing real-world challenges by considering the underlying characteristics of screen content. First, we propose a new dataset for the SR of screen content video with different distortion levels. In addition, we design an efficient SR structure that can capture the characteristics of compressed screen content video and exploit the inter-frame correlations among consecutive compressed low-resolution frames, facilitating high-quality recovery of the high-resolution counterpart.
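The multi-frame idea can be sketched minimally: fuse aligned consecutive low-resolution frames, then upsample. This is a hypothetical stand-in for the paper's learned architecture, with simple averaging in place of learned fusion and nearest-neighbour interpolation in place of the SR head.

```python
def fuse_and_upsample(frames, scale=2):
    # frames: list of aligned LR frames, each a 2-D grid (H x W).
    h, w = len(frames[0]), len(frames[0][0])
    # Temporal fusion: average co-located pixels across consecutive frames,
    # which suppresses independent compression noise.
    fused = [[sum(f[i][j] for f in frames) / len(frames) for j in range(w)]
             for i in range(h)]
    # Nearest-neighbour upsampling as a stand-in for the learned SR head.
    return [[fused[i // scale][j // scale] for j in range(w * scale)]
            for i in range(h * scale)]
```

A learned model would replace both steps with convolutional alignment/fusion modules and sub-pixel upsampling, but the data flow is the same.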