End-to-end image compression has achieved satisfactory results in recent years. However, existing methods rely on highly complex neural networks and cannot be deployed directly on mobile devices because of limited computation and storage. Therefore, considering the resource and computing constraints of mobile devices, we make a trade-off in this paper between rate-distortion (R-D) performance, inference time, and model complexity.

Visual content is increasingly being used for more than human viewing. For example, traffic video is automatically analyzed to count vehicles, detect traffic violations, estimate traffic intensity, and recognize license plates; images uploaded to social media are automatically analyzed to detect and recognize people, organize images into thematic collections, and so on; visual sensors on autonomous vehicles analyze captured signals to help the vehicle navigate, avoid obstacles and collisions, and optimize its movement.

This supplementary material presents detailed experimental results for the ICIP 2024 submission titled "Standard Compliant Video Coding using Low Complexity, Switchable Neural Wrappers".

Effective compression of 360-degree images, also referred to as omnidirectional images (ODIs), is of high interest for various virtual reality (VR) and related applications. 2D image compression methods ignore the equator-biased nature of ODIs and fail to address oversampling near the poles, leading to inefficient compression when applied to ODIs. We present a new learned saliency-aware 360-degree image compression architecture that prioritizes bit allocation to more significant regions, considering the unique properties of ODIs.
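
The oversampling near the poles can be illustrated with a small sketch. It assumes the ODI is stored in equirectangular projection, which the abstract does not state, and it uses the cosine-of-latitude row weighting familiar from spherically weighted quality metrics. The function names `latitude_weights` and `weighted_mse` are hypothetical, and this is not the saliency-aware method of the paper, only a way to see why rows near the poles should count for less.

```python
import math


def latitude_weights(height):
    """Per-row weights for an image with `height` rows.

    Assumption: equirectangular layout, so row centres map linearly to
    latitude and each row's true area on the sphere scales with cos(latitude).
    """
    return [
        math.cos((row + 0.5 - height / 2) * math.pi / height)
        for row in range(height)
    ]


def weighted_mse(ref, test):
    """Latitude-weighted mean squared error between two images.

    Both images are lists of rows of numeric pixel values with equal shape.
    """
    weights = latitude_weights(len(ref))
    num = 0.0
    den = 0.0
    for w, ref_row, test_row in zip(weights, ref, test):
        for a, b in zip(ref_row, test_row):
            num += w * (a - b) ** 2
            den += w
    return num / den


if __name__ == "__main__":
    # Rows near the poles (first and last) get the smallest weights.
    for row, w in enumerate(latitude_weights(8)):
        print(row, round(w, 3))
```

Under this weighting an error in a polar row contributes less than the same error at the equator, which is one simple way a codec or metric can avoid spending bits on oversampled regions.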

Video Multimethod Assessment Fusion (VMAF) [1],[2],[3] is a popular tool in the industry for measuring coded video quality. In this study, we propose an auditory-inspired frontend for the existing VMAF that creates videos of reference and coded spectrograms, and we extend VMAF to measure coded audio quality. We name our system AudioVMAF. We demonstrate that image replication can further enhance prediction accuracy, especially when band-limited anchors are present.

Deep learning (DL) based tools have recently reached performance levels similar to state-of-the-art hand-crafted methods for Point Cloud (PC) coding and classification. In 2022, JPEG issued a Call for Proposals for a Learning-based PC Coding (PCC) standard that envisions a unified representation, targeting both human visualization and computer vision tasks. This paper proposes the first DL-based Compressed Domain PC CLassifier (CD-PCCL), built on the PointGrid classifier, for geometry-only PCs coded with the current DL-based JPEG Pleno PCC Verification Model.

Neural network-based image compression has made significant progress in recent years. Learned image codecs are commonly reported to outperform their conventional counterparts in perceptual quality. Despite this superior performance, learned image codecs are much more complex to decode, which hinders their use in practice. Without a significant advance in hardware capability, the conventional image codec will likely remain a primary component of large-scale image services. It is therefore desirable to improve the quality of conventional image codecs.

The coded aperture snapshot spectral imager (CASSI) system senses spatial and spectral information using a binary coded aperture and a dispersive element; the quality of reconstructed hyperspectral images is therefore mainly determined by the structure of the coded aperture. Traditional coded apertures (Random, Bernoulli, etc.), which encode hyperspectral images in the focal array plane, suffer from suboptimal reconstruction accuracy. Optimizing the coded aperture design therefore improves the reconstruction quality for the scene.
