We propose a talking face video compression framework that implicitly transforms the temporal evolution of faces into a compact feature representation. More specifically, this temporal evolution, which is complex, non-linear, and difficult to extrapolate, is modelled in an end-to-end inference framework based on very compact features. This enables high-quality rendering of the face videos, which benefits from learning a dense motion map with a compact feature representation.

As a new signal processing technology, compressed sensing (CS) has been shown to be a promising solution for compressing cipher images. However, previous CS-based schemes are unsatisfactory in terms of rate-distortion (R-D) performance. To solve this problem, this paper proposes an image encryption-then-compression (ETC) scheme using semi-tensor product CS (STP-CS) and pre-mapping. In the proposed scheme, the original image is encrypted with a scrambling operation. After encryption, the cipher image is compressed in three steps.
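
As a rough illustration of the two building blocks named above, the following sketch scrambles a flattened image with a key-seeded permutation and then takes a semi-tensor product measurement, y = (A ⊗ I_k) x, with a small measurement matrix A. It is a minimal sketch under stated assumptions, not the scheme from the paper: the function names, the key-seeded shuffle, and the toy matrix are all hypothetical, the pre-mapping and the three compression steps are not modelled, and Python's `random` module is not a cryptographic generator.

```python
import random


def scramble(pixels, key):
    """Permute pixel positions with a key-seeded shuffle (illustrative only)."""
    order = list(range(len(pixels)))
    random.Random(key).shuffle(order)
    return [pixels[i] for i in order], order


def unscramble(scrambled, order):
    """Invert scramble() given the permutation it returned."""
    restored = [0] * len(scrambled)
    for position, source in enumerate(order):
        restored[source] = scrambled[position]
    return restored


def stp_measure(A, x):
    """Semi-tensor product measurement y = (A kron I_k) x.

    A is an m x n matrix (list of rows) and len(x) must be a multiple of n,
    with k = len(x) // n. The result has m * k entries, so a small A can
    measure a signal that is k times longer than its column count.
    """
    m, n = len(A), len(A[0])
    if len(x) % n != 0:
        raise ValueError("len(x) must be a multiple of the column count of A")
    k = len(x) // n
    return [
        sum(A[a][i] * x[i * k + j] for i in range(n))
        for a in range(m)
        for j in range(k)
    ]


# Toy example: 8 "pixels", a 2 x 4 measurement matrix, 4 measurements.
pixels = [1, 2, 3, 4, 5, 6, 7, 8]
cipher, order = scramble(pixels, key=2024)
A = [[1, 0, 1, 0],
     [0, 1, 0, 1]]
measurements = stp_measure(A, cipher)
```

The point of the semi-tensor product here is storage: A has only m × n entries, yet it acts on a signal of length n·k, whereas a conventional CS measurement matrix would need m·k × n·k entries for the same compression ratio.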

Efficient remote browsing of pathology images in telemedicine requires efficient image compression. In this work, we establish a visibility threshold (VT) model that jointly accounts for multiple resolutions and different visual qualities. Based on this model, we propose an image coding method under the JPEG2000 standard for whole-slide pathology images (WSIs), which operates adaptively according to the required resolutions and visual qualities.

Recently, many efforts have been devoted to learning non-linear predictions from neighboring samples with deep neural networks. However, existing methods mainly generate predictions from local reference samples and ignore nonlocal self-similarity.

Motion capture (MoCap) data is a fundamental asset for digital entertainment. The growing number of 3D applications makes MoCap data compression increasingly important. In this paper, we propose an end-to-end attribute-decomposable motion compression network using the autoencoder architecture. Specifically, the algorithm consists of an LSTM-based encoder-decoder for compression and decompression. The encoder module decomposes human motion into multiple uncorrelated semantic attributes, including action content, arm space, and motion mirror.

Keyframe insertion is a solution for fast channel switching and packet-loss repair in low-delay live streaming.

This work lists the requirements of keyframe insertion in three generations of video coding standards (H.264/AVC, H.265/HEVC, and H.266/VVC), and analyzes the quality impact.

Deep learning methods have achieved good results at the in-loop filtering stage of Versatile Video Coding (VVC). The Multi-Frame In-Loop Filter (MIF) for HEVC is one of the networks that effectively use multiple reference frames to enhance reconstruction quality. However, its reference frame selection is inefficient, and its quality enhancement network is highly redundant and does not fully exploit the inter-frame correlations of the video sequence.

Geometric partitioning mode (GPM) is a new inter-prediction tool introduced in Versatile Video Coding (VVC) that splits a coding unit (CU) into non-rectangular partitions to improve coding performance. To address the large computational redundancy of GPM combined with merge mode with motion vector difference (GPM with MMVD), this paper proposes a new decision algorithm based on the CU gradient.
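
The abstract does not say how the CU gradient drives the decision, so the following is only a hypothetical sketch of what a gradient-based early-skip rule could look like. It assumes that a smooth CU (low mean gradient) is unlikely to benefit from a non-rectangular split and can skip the GPM search. The function names, the gradient measure, and the threshold value are all illustrative assumptions, not the algorithm from the paper.

```python
def mean_gradient(block):
    """Mean absolute difference between horizontally and vertically
    adjacent luma samples of a 2-D block (list of rows)."""
    height, width = len(block), len(block[0])
    total = 0
    count = 0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:
                total += abs(block[y][x + 1] - block[y][x])
                count += 1
            if y + 1 < height:
                total += abs(block[y + 1][x] - block[y][x])
                count += 1
    return total / count if count else 0.0


def should_test_gpm(block, threshold=4.0):
    """Hypothetical rule: run the GPM search only when the CU has enough
    gradient energy. The threshold of 4.0 is an arbitrary placeholder."""
    return mean_gradient(block) >= threshold


# A flat CU is skipped; a CU containing a sharp vertical edge is tested.
flat_cu = [[128] * 4 for _ in range(4)]
edge_cu = [[0, 0, 100, 100] for _ in range(4)]
skip_flat = not should_test_gpm(flat_cu)
test_edge = should_test_gpm(edge_cu)
```

In a real encoder the threshold would have to be tuned against the rate-distortion loss it causes, and it would likely depend on CU size and quantization parameter.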

