Over recent years, deep learning-based computer vision systems have been applied to images at an ever-increasing pace, and they are often the only consumers of those images. Given the dramatic growth in the number of images generated per day, a question arises: how much better would an image codec targeting machine consumption perform than state-of-the-art codecs targeting human consumption? In this paper, we propose an image codec for machines that is neural network (NN) based and end-to-end learned.

Lossless compression of datasets is a problem of significant theoretical and practical interest. It appears naturally in the task of storing, sending, or archiving large collections of information for scientific research. We can greatly improve the encoding bitrate if we allow the compressed dataset to decompress to a permutation of the original data. We prove the equivalence of dataset compression to compressing a permutation-invariant structure of the data and implement such a scheme via predictive coding.
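
To give a sense of the potential saving, here is a minimal sketch (not the authors' scheme). It assumes a dataset of n distinct items whose order is irrelevant: the ordering alone then accounts for log2(n!) bits that an order-agnostic codec need not spend. The function name `order_bits` is an illustrative assumption.

```python
import math


def order_bits(n: int) -> float:
    """Bits needed to single out one of the n! orderings of n distinct items."""
    # lgamma(n + 1) equals ln(n!); dividing by ln(2) converts nats to bits.
    return math.lgamma(n + 1) / math.log(2)


if __name__ == "__main__":
    for n in (10, 1_000, 1_000_000):
        total = order_bits(n)
        print(f"n={n}: {total:.1f} bits in total, {total / n:.2f} bits per item")
```

When the dataset contains repeated items, fewer orderings are distinguishable, so the saving is smaller than this figure.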

Rain in images degrades visual quality, which can reduce the accuracy of downstream applications. In this paper, we propose a novel densely connected network with Dense Feature Pyramid Grids modules, called DFPGN, to solve the rain removal task. Specifically, in the proposed DFPG module, the input of each layer is formed by five operations that draw on different layers through various pathways and scales, so that each layer can fuse features from both shallower and deeper layers to improve the deraining ability of the network.

Saliency-driven image and video coding for humans has gained importance in the recent past. In this paper, we propose such a saliency-driven coding framework for the video coding for machines task using the latest video coding standard Versatile Video Coding (VVC). To determine the salient regions before encoding, we employ the real-time-capable object detection network You Only Look Once (YOLO) in combination with a novel decision criterion. To measure the coding quality for a machine, the state-of-the-art object segmentation network Mask R-CNN was applied to the decoded frame.
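
As a rough illustration of the idea, the sketch below turns detector bounding boxes into a per-block quantization parameter (QP) map. It is not the authors' method: their decision criterion is not described here, so a simple overlap test stands in for it. The function name `qp_map`, the box format, and the two QP values are hypothetical; the 128-pixel block size matches the largest coding tree unit VVC allows.

```python
from typing import List, Tuple

# Hypothetical box format: (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
Box = Tuple[int, int, int, int]


def qp_map(width: int, height: int, boxes: List[Box], ctu: int = 128,
           qp_salient: int = 27, qp_background: int = 37) -> List[List[int]]:
    """Return a rows x cols grid of QPs, one per block of size ctu x ctu.

    Blocks overlapped by any box get qp_salient (finer quantization);
    all other blocks get qp_background. Both QP values are assumptions.
    """
    cols = -(-width // ctu)   # ceiling division
    rows = -(-height // ctu)
    grid = [[qp_background] * cols for _ in range(rows)]
    for x0, y0, x1, y1 in boxes:
        # Clip the box to the frame and skip it if nothing remains.
        x0, y0 = max(x0, 0), max(y0, 0)
        x1, y1 = min(x1, width), min(y1, height)
        if x0 >= x1 or y0 >= y1:
            continue
        # Mark every block that the clipped box touches.
        for r in range(y0 // ctu, (y1 - 1) // ctu + 1):
            for c in range(x0 // ctu, (x1 - 1) // ctu + 1):
                grid[r][c] = qp_salient
    return grid
```

In practice the boxes would come from an object detector such as YOLO, and the resulting map would be handed to the encoder through whatever per-block QP control it exposes.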


Slides and poster presented during ICASSP 2021 about our work on "Relying on a rate constraint to reduce Motion Estimation complexity".
