Multispectral image fusion is a computer vision process that is essential to remote sensing. Applications such as dehazing and object detection need solutions that run in real time on any type of scene. Unfortunately, current state-of-the-art approaches do not meet these criteria: they must be trained on domain-specific data and have high computational complexity. This paper focuses on fusing color (RGB) and near-infrared (NIR) images, as this is the typical case for RGBT sensors, as in multispectral cameras for detection, fusion, and dehazing.

Adaptive image restoration models can restore images with different degradation levels at inference time without the need to retrain the model. We present an approach that is highly accurate and allows a significant reduction in the number of parameters. In contrast to existing methods, our approach can restore images using a single fixed-size model, regardless of the number of degradation levels. On popular datasets, our approach yields state-of-the-art results in terms of size and accuracy for a variety of image restoration tasks, including denoising, deJPEG, and super-resolution.

Object detection is a fundamental task in computer vision, consisting of both classification and localization. Previous works mostly perform classification and localization with a shared feature extractor such as a convolutional neural network. However, the two tasks exhibit different sensitivities to the same feature, which gives rise to the "task spatial misalignment" issue. This issue can lead to a trade-off between the performance of the localizer and that of the classifier.

Most existing gait recognition methods are appearance-based, relying on silhouettes extracted from video of human walking. The less-investigated skeleton-based gait recognition methods learn gait dynamics directly from 2D/3D human skeleton sequences, which are theoretically more robust to appearance changes caused by clothes, hairstyles, and carried objects. However, the performance of skeleton-based solutions still lags far behind that of appearance-based ones.

Subjective image-quality measurement plays a critical role in the development of image-processing applications. The purpose of a visual-quality metric is to approximate the results of subjective assessment. In this regard, more and more metrics are under development, but little research has considered their limitations. This paper addresses that deficiency: we show how image preprocessing before compression can artificially increase the quality scores provided by the popular metrics DISTS, LPIPS, HaarPSI, and VIF as well as how these scores are inconsistent with subjective-quality scores.

Most video platforms provide streaming services at different quality levels, and the service quality is usually adjusted through the video resolution, so high-resolution videos need to be downsampled before compression. To solve the problem of video coding at different resolutions, we propose a rate-guided arbitrary rescaling network (RARN) for video resizing before encoding.

Video quality assessment (VQA) for user-generated content (UGC) videos plays an important role in video compression and processing. Convolutional neural network (CNN) based quality assessment for UGC has been a research focus, with encouraging gains in model accuracy over the past three years. However, regular temporal sampling, which loses temporal features, and the fixed token-selection strategy of the video transformer (ViT), whose tokens have insufficient representational capacity, jointly degrade the accuracy of conventional ViT-based quality assessment.

By exploiting the potential of deep learning, video compressive sensing (CS) has improved tremendously in recent years. In real life, video CS mainly serves fixed scenes. In this paper, we propose a novel video compressive sensing method with low-complexity region-of-interest (ROI) detection (VCSL). In our framework, the ROI is located by calculating the difference between the reference frame and the following frames, which keeps the method compact and introduces no additional neural networks or parameters.
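
The frame-difference idea behind the ROI detection can be illustrated with a minimal sketch. This is not the authors' VCSL implementation: the function name `roi_from_difference`, the fixed threshold, and the bounding-box output are all assumptions made for illustration, and the abstract does not say how the difference is turned into a region.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale frame as rows of pixel intensities


def roi_from_difference(
    reference: Frame, frame: Frame, threshold: int = 20
) -> Optional[Tuple[int, int, int, int]]:
    """Hypothetical helper: bounding box of pixels that differ from the reference.

    Returns (top, left, bottom, right), all inclusive, or None when no pixel
    differs from the reference by more than `threshold` (an assumed parameter).
    """
    # Collect coordinates where the absolute frame difference is large.
    changed = [
        (r, c)
        for r, (ref_row, cur_row) in enumerate(zip(reference, frame))
        for c, (a, b) in enumerate(zip(ref_row, cur_row))
        if abs(a - b) > threshold
    ]
    if not changed:
        return None
    rows = [r for r, _ in changed]
    cols = [c for _, c in changed]
    return min(rows), min(cols), max(rows), max(cols)


# Toy example: a static 4x5 background and a later frame with two bright pixels.
reference = [[0] * 5 for _ in range(4)]
frame = [row[:] for row in reference]
frame[1][2] = 200
frame[2][3] = 200

roi = roi_from_difference(reference, frame)
```

Because the sketch only subtracts and thresholds pixel values, it needs no learned parameters, which matches the low-complexity motivation stated in the abstract; a real system would likely add noise handling that is omitted here.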
