Multi-Task Image and Video Compression
- Submitted by: Ivan Bajic
- Last updated: 30 July 2023 - 4:18pm
- Document Type: Tutorial
- Document Year: 2022
- Presenters: Ivan V. Bajic
Visual content is increasingly being used for more than human viewing. For example, traffic video is automatically analyzed to count vehicles, detect traffic violations, estimate traffic intensity, and recognize license plates; images uploaded to social media are automatically analyzed to detect and recognize people, organize images into thematic collections, and so on; visual sensors on autonomous vehicles analyze captured signals to help the vehicle navigate, avoid obstacles and collisions, and optimize its movement. The sheer amount of visual content used for purposes other than human viewing demands rethinking the traditional approaches to image and video compression.
This tutorial is about techniques for compressing images and video for multiple purposes beyond human viewing. We will start the first part of the tutorial by reviewing early attempts at tackling multi-task usage of compressed visual content. We will discuss several representative problems in “compressed-domain” image and video analysis, such as interest-point detection, face and person detection, saliency detection, and object tracking. We will briefly mention several MPEG standards for encoding features related to image and video analysis, such as Compact Descriptors for Visual Search (CDVS) and Compact Descriptors for Video Analysis (CDVA).
The second part of the tutorial is devoted to recent learning-based image and video compression methods, which offer much more flexibility for multi-task compression. We will review some basic concepts from information theory that will help in appreciating the subsequent material. We will then present several recent Deep Neural Network (DNN) models for image and video compression and discuss how they might be used in multi-task compression. We will also discuss task-scalability and privacy in the context of multi-task compression. Finally, we will review recent standardization activities related to multi-task compression, such as JPEG AI and MPEG Video Coding for Machines (VCM).