
Efficient remote browsing of pathology images in telemedicine requires efficient image compression. In this work, we establish a visibility threshold (VT) model that jointly accounts for multiple resolutions and different visual qualities. Based on this model, we propose an image coding method under the JPEG2000 standard for whole-slide pathology images (WSIs), which adapts to the required resolution and visual quality.


Recently, many efforts have been devoted to learning non-linear predictions from neighboring samples with deep neural networks. However, existing methods mainly generate predictions from local reference samples and ignore nonlocal self-similarity.


Motion capture (MoCap) data is a fundamental asset for digital entertainment. The growing number of 3D applications makes MoCap data compression increasingly important. In this paper, we propose an end-to-end attribute-decomposable motion compression network based on an autoencoder architecture. Specifically, the algorithm consists of an LSTM-based encoder-decoder for compression and decompression. The encoder module decomposes human motion into multiple uncorrelated semantic attributes, including action content, arm space, and motion mirror.


Keyframe insertion is a solution for fast channel switching and packet-loss repair in low-delay live streaming.

This work lists the requirements of keyframe insertion in three generations of video coding standards (H.264/AVC, H.265/HEVC, and H.266/VVC), and analyzes the quality impact.


Deep learning methods have achieved good results at the in-loop filtering stage in Versatile Video Coding (VVC). The Multi-Frame In-Loop Filter of HEVC (MIF) is one network that effectively uses multiple reference frames to enhance reconstruction quality. However, its reference frame selection is inefficient, and its quality enhancement network does not fully exploit the inter-frame correlations of the video sequence and contains considerable redundancy.


Geometric prediction merge mode (GPM) is a new tool introduced in the inter-prediction of Versatile Video Coding (VVC), which uses non-rectangular partitions of coding units (CUs) to improve coding performance. To reduce the large computational redundancy of the geometric prediction merge mode with motion vector refinement (GPM with MMVD), this paper proposes a new decision algorithm based on the CU gradient.


The quadtree with nested multi-type tree (MTT) partitioning scheme is adopted by H.266/VVC, the latest generation of video coding standards. While encoder performance is improved, complexity increases by 2.2-5.6 times. It has been found that in the inter-coding mode, blocks with intense motion or complex texture tend to be divided further by the encoder. However, existing multi-type tree partition algorithms use only inter-frame motion information or only intra-frame texture information when deciding the partition.


Over the past decade, nonlinear image compression techniques based on neural networks have developed rapidly, achieving more efficient storage and transmission of images than conventional linear techniques. A typical nonlinear technique is implemented as a neural network trained on a vast set of images, and the latent representation of a target image is transmitted. In contrast to these previous nonlinear techniques, we propose a new image compression method in which a neural network model is trained exclusively on a single target image rather than on a set of images.

