In this work, we propose a variable-rate scheme for deep video compression that achieves a continuously variable rate with a single model. The key idea is to use the R-D tradeoff parameter \(\lambda\) as a conditional parameter to control the bitrate. The scheme is built on DVC, which jointly learns motion estimation, motion compression, motion compensation, and residual compression functions. In this framework, the motion and residual compression auto-encoders are critical for rate adaptation because they generate the final bitstream directly.
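As a rough illustration of why \(\lambda\) can act as a rate control knob, the sketch below picks, among a few candidate operating points, the one that minimizes the usual rate-distortion cost R + \(\lambda\)D. The operating points and the helper name `best_operating_point` are hypothetical values chosen for illustration; they are not taken from the paper, and the sketch does not reproduce its conditional auto-encoders.

```python
def rd_cost(rate, distortion, lam):
    """Rate-distortion cost R + lambda * D (larger lambda penalizes distortion more)."""
    return rate + lam * distortion


def best_operating_point(points, lam):
    """Return the (rate, distortion) pair with the lowest R-D cost for this lambda."""
    return min(points, key=lambda p: rd_cost(p[0], p[1], lam))


# Hypothetical (bits per pixel, mean squared error) pairs, for illustration only.
points = [(0.1, 40.0), (0.3, 20.0), (0.8, 10.0)]

for lam in (0.001, 0.02, 0.1):
    print(lam, best_operating_point(points, lam))
```

With these made-up numbers, a small \(\lambda\) favors the low-rate point and a large \(\lambda\) favors the low-distortion point, which is the behavior a single \(\lambda\)-conditioned model would need to reproduce continuously.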


In this paper, the authors present a novel convolutional neural network structure for lossy image compression, intended for use as part of JPEG’s standard image compression stream. The network is trained on randomly selected images from a high-quality dataset of human faces, and its effectiveness is verified experimentally using standard test images.


In the latest video coding standard, Versatile Video Coding (H.266/VVC), a new quadtree with nested multi-type tree (QTMT) coding block structure is introduced. QTMT significantly improves coding performance, but its more complex block partitioning structure brings a greater computational burden. To solve this problem, a fast intra block partition pattern pruning algorithm is proposed. It uses the gray-level co-occurrence matrix (GLCM) to calculate the texture direction of coding units, and terminates the horizontal or vertical split of the binary tree and the ternary tree in advance.
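A minimal sketch of how a GLCM can expose texture direction follows. It builds co-occurrence matrices for horizontal and vertical neighbor pairs and compares their contrast. The contrast feature, the `ratio` threshold, the four gray levels, and the function names are assumptions made for illustration; the abstract does not state which GLCM features or thresholds the proposed algorithm uses.

```python
def glcm(block, dy, dx, levels):
    """Count co-occurrences of gray levels (i, j) for pixel pairs at offset (dy, dx)."""
    h, w = len(block), len(block[0])
    m = [[0] * levels for _ in range(levels)]
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                m[block[y][x]][block[y2][x2]] += 1
    return m


def contrast(m):
    """Mean squared gray-level difference over all counted pairs."""
    total = sum(sum(row) for row in m)
    if total == 0:
        return 0.0
    weighted = sum((i - j) ** 2 * c for i, row in enumerate(m) for j, c in enumerate(row))
    return weighted / total


def texture_direction(block, levels=4, ratio=2.0):
    """Classify texture as 'horizontal', 'vertical', or 'none' (hypothetical rule)."""
    c_h = contrast(glcm(block, 0, 1, levels))  # change between left/right neighbors
    c_v = contrast(glcm(block, 1, 0, levels))  # change between top/bottom neighbors
    if c_v > ratio * c_h:
        return "horizontal"  # rows are smooth, so stripes run horizontally
    if c_h > ratio * c_v:
        return "vertical"    # columns are smooth, so stripes run vertically
    return "none"


# Block already quantized to 4 gray levels, with horizontal stripes.
stripes = [[0, 0, 0, 0], [3, 3, 3, 3], [0, 0, 0, 0], [3, 3, 3, 3]]
print(texture_direction(stripes))
```

Under this assumed rule, an encoder could skip vertical binary and ternary splits for a block classified as "horizontal", and the reverse for "vertical".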


Recent years have witnessed the growth of point cloud based applications for both immersive media and 3D sensing for autonomous driving, because point clouds offer a realistic and fine-grained representation of 3D objects and scenes. However, it is challenging to compress sparse, unstructured, and high-precision 3D points for efficient communication. In this paper, leveraging the sparse nature of the point cloud, we propose a multiscale end-to-end learning framework that hierarchically reconstructs the 3D Point Cloud Geometry (PCG) via progressive re-sampling.


Light field imaging enables post-processing capabilities such as refocusing, changing the view perspective, and depth estimation. Because light field images are represented by multiple views, they contain a huge amount of data, which makes compression essential. Existing proposals for compressing light field images focus mainly on encoding efficiency, while important functionalities such as viewpoint and quality scalability, random access, and uniform quality distribution have not been addressed adequately.


JPEG has been a widely used lossy image compression codec for nearly three decades. The JPEG standard allows the use of customized quantization tables; however, finding an optimal quantization table at an acceptable computational cost remains a challenging problem. This work tries to balance computational cost against image-specific optimality by introducing a new concept of texture mosaic images.
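To make the role of the quantization table concrete, the sketch below quantizes and dequantizes a small grid of DCT coefficients against a table of step sizes. The coefficient values and the 2x2 table are hypothetical and only illustrate the arithmetic (JPEG itself works on 8x8 blocks); the sketch does not implement the texture mosaic method.

```python
def quantize(coeffs, table):
    """Divide each DCT coefficient by its table entry and round to an integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]


def dequantize(levels, table):
    """Multiply quantized levels back by the table entries (decoder side)."""
    return [[v * q for v, q in zip(v_row, q_row)]
            for v_row, q_row in zip(levels, table)]


# Hypothetical coefficients and step sizes, for illustration only.
coeffs = [[-415, 30], [5, -3]]
table = [[16, 11], [12, 12]]

levels = quantize(coeffs, table)
print(levels)
print(dequantize(levels, table))
```

Larger table entries zero out more coefficients, which lowers the bitrate at the cost of reconstruction error; this is the tradeoff a customized table tunes per image.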


In this paper, a novel QP-variable convolutional neural network based in-loop filter is proposed for VVC intra coding. To avoid training and deploying multiple networks, we develop an efficient QP attention module (QPAM) that captures compression noise levels for different QPs and emphasizes meaningful features along the channel dimension. We then embed QPAM into the residual block and, based on it, design a network architecture that can be controlled for different QPs.


Classical video coding with humans as the final user is a widely investigated field of study for visual content, and common video codecs are all optimized for the human visual system (HVS). But are these assumptions and optimizations also valid when the compressed video stream is analyzed by a machine? To answer this question, we compared the performance of two state-of-the-art neural detection networks when fed with degraded input images coded with HEVC and VVC in an autonomous driving scenario using intra coding.

