- Image/Video Storage, Retrieval
- Image/Video Processing
- Image/Video Coding
- Image Scanning, Display, and Printing
- Image Formation
- MTIDNET: A MULTIMODAL TEMPORAL INTEREST DETECTION NETWORK FOR VIDEO SUMMARIZATION
Video summarization creates a succinct overview of a video by merging its most valuable parts. Existing methods approach the task as a keyframe-selection problem, using frame- and shot-level techniques with unimodal or bimodal information. Besides underestimating the inter-relations between different configurations of modality embedding spaces, current methods are also limited in their ability to maintain the integrity of the semantics within the same summary segment. To address these issues,
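For context, here is a minimal sketch of the frame-level keyframe-selection formulation the abstract refers to: each frame receives an importance score and the summary keeps the top-scoring frames under a length budget. The scoring itself is a placeholder here, not the MTIDNET model.

```python
import numpy as np

def select_keyframes(frame_scores, budget_ratio=0.15):
    """Pick the highest-scoring frames until the summary budget is reached.

    frame_scores : 1-D array of per-frame importance scores
                   (in practice predicted by a summarization network).
    budget_ratio : fraction of the video allowed in the summary.
    """
    n_frames = len(frame_scores)
    budget = max(1, int(n_frames * budget_ratio))
    # Indices of the top-scoring frames, returned in temporal order.
    keyframes = np.argsort(frame_scores)[-budget:]
    return np.sort(keyframes)

# Toy usage: 100 frames with random scores, keep ~15 keyframes.
scores = np.random.rand(100)
print(select_keyframes(scores))
```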
- ATTENTIONLUT: ATTENTION FUSION-BASED CANONICAL POLYADIC LUT FOR REAL-TIME IMAGE ENHANCEMENT
Recently, many algorithms have employed image-adaptive lookup tables (LUTs) to achieve real-time image enhancement. However, existing methods typically construct image-adaptive LUTs as linear combinations of basic LUTs, which limits their generalization ability. To address this limitation, we propose AttentionLut, a novel framework for real-time image enhancement that uses an attention mechanism to generate image-adaptive LUTs. The proposed framework consists of three lightweight modules.
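As a rough illustration of attention-weighted LUT fusion (not the AttentionLut modules themselves), the sketch below fuses K basis lookup tables with softmax attention weights and applies the result. For simplicity it uses per-channel 1-D LUTs instead of the 3-D LUTs used in practice, and the attention logits are assumed to come from some image encoder.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def enhance_with_fused_lut(image, basis_luts, query_logits):
    """Fuse K basis LUTs with image-adaptive attention weights, then apply.

    image       : H x W x 3 float image in [0, 1]
    basis_luts  : K x 256 x 3 array of per-channel 1-D LUTs (a simplification
                  of the 3-D LUTs used by LUT-based enhancement methods)
    query_logits: K raw attention logits predicted from image features
    """
    attn = softmax(query_logits)                        # (K,)
    fused = np.tensordot(attn, basis_luts, axes=1)      # (256, 3)
    idx = np.clip((image * 255).astype(int), 0, 255)    # (H, W, 3)
    # Look up each channel in its fused 1-D curve.
    return np.stack([fused[idx[..., c], c] for c in range(3)], axis=-1)

# Toy usage: identity-like basis LUTs and logits from a (placeholder) encoder.
K = 3
luts = np.tile(np.linspace(0, 1, 256)[None, :, None], (K, 1, 3))
img = np.random.rand(32, 32, 3)
logits = np.random.randn(K)      # stand-in for attention logits
print(enhance_with_fused_lut(img, luts, logits).shape)
```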
- SELF-SUPERVISED MULTI-SCALE HIERARCHICAL REFINEMENT METHOD FOR JOINT LEARNING OF OPTICAL FLOW AND DEPTH
Recurrently refining the optical flow over a single high-resolution feature has been shown to deliver strong performance. We exploit this strategy to build a novel architecture for the joint learning of optical flow and depth. The proposed architecture is adapted to training on unlabeled data, which is extremely challenging. The loss is computed over the iterations carried out on the single high-resolution feature, where the reconstruction loss fails to optimize accuracy, particularly in occluded regions.
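A minimal sketch of the occlusion-masked photometric (reconstruction) loss that underlies this kind of unsupervised training, assuming PyTorch and an occlusion mask supplied from elsewhere; it is not the paper's refinement architecture.

```python
import torch
import torch.nn.functional as F

def photometric_loss(img1, img2, flow, occlusion_mask):
    """Masked photometric (reconstruction) loss for unsupervised flow learning.

    img1, img2     : (B, 3, H, W) consecutive frames
    flow           : (B, 2, H, W) flow from img1 to img2, in pixels
    occlusion_mask : (B, 1, H, W) 1 where pixels are visible in both frames
    """
    b, _, h, w = img1.shape
    # Build a sampling grid: pixel coordinates shifted by the flow,
    # normalized to [-1, 1] as required by grid_sample.
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(flow)      # (2, H, W)
    coords = grid.unsqueeze(0) + flow                          # (B, 2, H, W)
    coords_x = 2.0 * coords[:, 0] / (w - 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / (h - 1) - 1.0
    sample_grid = torch.stack((coords_x, coords_y), dim=-1)    # (B, H, W, 2)
    warped = F.grid_sample(img2, sample_grid, align_corners=True)
    # Only visible pixels contribute; occluded regions are excluded,
    # which is exactly where a plain reconstruction loss breaks down.
    diff = (img1 - warped).abs() * occlusion_mask
    return diff.sum() / (occlusion_mask.sum() * 3 + 1e-6)

# Toy usage with random tensors and a zero flow field.
i1, i2 = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
fl, mask = torch.zeros(1, 2, 64, 64), torch.ones(1, 1, 64, 64)
print(photometric_loss(i1, i2, fl, mask))
```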
- LIGHTING IMAGE/VIDEO STYLE TRANSFER METHODS BY ITERATIVE CHANNEL PRUNING
Deploying style transfer methods on resource-constrained devices is challenging, which limits their real-world applicability. To tackle this issue, we propose using pruning techniques to accelerate various visual style transfer methods. We argue that typical pruning methods may not be well-suited for style transfer methods and present an iterative correlation-based channel pruning (ICCP) strategy for encoder-transform-decoder-based image/video style transfer models.
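The sketch below shows only the generic idea behind correlation-based channel pruning, assuming channel activations have already been collected: repeatedly remove the channel most correlated with another remaining channel. The actual ICCP criterion and schedule in the paper may differ.

```python
import numpy as np

def correlation_prune(activations, n_prune):
    """Iteratively drop the most redundant channels.

    activations : (C, N) matrix of channel responses collected over N samples
    n_prune     : number of channels to remove
    Returns the indices of the channels that are kept.
    """
    keep = list(range(activations.shape[0]))
    for _ in range(n_prune):
        acts = activations[keep]
        corr = np.abs(np.corrcoef(acts))
        np.fill_diagonal(corr, 0.0)
        # The channel whose maximum correlation with any other remaining
        # channel is largest carries the most redundant information.
        victim = int(np.argmax(corr.max(axis=1)))
        keep.pop(victim)
    return keep

# Toy usage: 8 channels, two of them near-duplicates of channel 0.
acts = np.random.randn(8, 100)
acts[1] = acts[0] + 0.01 * np.random.randn(100)
acts[2] = -acts[0] + 0.01 * np.random.randn(100)
print(correlation_prune(acts, n_prune=2))
```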
- LEARNING SPATIO-TEMPORAL RELATIONS WITH MULTI-SCALE INTEGRATED PERCEPTION FOR VIDEO ANOMALY DETECTION
In weakly supervised video anomaly detection, it has been verified that anomaly predictions can be biased by background noise. Previous works attempted to focus on local regions to exclude irrelevant information. However, abnormal events in different scenes vary in size, and current methods struggle to consider local events of different scales concurrently. To this end, we propose a multi-scale integrated perception
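As a toy illustration of multi-scale temporal aggregation (not the paper's integrated perception module), the sketch below pools snippet features over several window sizes and concatenates the results, so events of different temporal extents can be captured side by side.

```python
import numpy as np

def multi_scale_temporal_features(snippet_feats, scales=(1, 3, 5)):
    """Aggregate snippet features over several temporal window sizes.

    snippet_feats : (T, D) features of T video snippets
    scales        : temporal window sizes (in snippets) to average over
    Returns a (T, D * len(scales)) multi-scale representation.
    """
    t, d = snippet_feats.shape
    outputs = []
    for k in scales:
        pad = k // 2
        padded = np.pad(snippet_feats, ((pad, pad), (0, 0)), mode="edge")
        # Moving average over a window of k snippets centred on each position.
        pooled = np.stack([padded[i:i + k].mean(axis=0) for i in range(t)])
        outputs.append(pooled)
    return np.concatenate(outputs, axis=1)

print(multi_scale_temporal_features(np.random.rand(16, 8)).shape)  # (16, 24)
```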
- DCL-NET: DUAL CONTRASTIVE LEARNING NETWORK FOR SEMI-SUPERVISED MULTI-ORGAN SEGMENTATION
Semi-supervised learning (SSL) is an effective way to relieve the demand for abundant annotated data, which is especially valuable for challenging multi-organ segmentation (MoS). However, most existing SSL methods predict the pixels of each image independently, ignoring the relations among images and categories. In this paper, we propose a two-stage Dual Contrastive Learning Network (DCL-Net) for semi-supervised MoS, which utilizes global and local contrastive learning to strengthen the relations among images and classes.
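A minimal InfoNCE-style contrastive loss, the standard building block behind this kind of global/local contrastive learning; the pairing strategy and projection heads used by DCL-Net are not shown, and the tensors here are placeholders.

```python
import torch
import torch.nn.functional as F

def info_nce(anchors, positives, temperature=0.1):
    """Standard InfoNCE contrastive loss: each anchor should be closest to
    its own positive and far from everyone else's.

    anchors, positives : (N, D) embedding batches; row i of each is a pair
    """
    a = F.normalize(anchors, dim=1)
    p = F.normalize(positives, dim=1)
    logits = a @ p.t() / temperature          # (N, N) cosine similarities
    targets = torch.arange(a.shape[0])        # the diagonal is the match
    return F.cross_entropy(logits, targets)

# Toy usage: e.g. embeddings of images vs. embeddings of their
# perturbed/augmented counterparts in a semi-supervised setup.
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
print(info_nce(z1, z2))
```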
- DT-NeRF: Decomposed Triplane-Hash Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis
In this paper, we present the decomposed triplane-hash neural radiance fields (DT-NeRF), a framework that significantly improves the photorealistic rendering of talking faces and achieves state-of-the-art results on key evaluation datasets. Our architecture decomposes the facial region into two specialized triplanes: one for representing the mouth and the other for the broader facial features. We introduce audio features as residual terms and integrate them as query vectors into our model through an audio-mouth-face transformer.
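To make the triplane idea concrete, here is a small sketch of a plain triplane feature query (project a 3-D point onto the XY/XZ/YZ planes, bilinearly sample each feature plane, and sum). The hash encoding, decomposition, and audio conditioning of DT-NeRF are not modeled here.

```python
import torch
import torch.nn.functional as F

def triplane_features(points, plane_xy, plane_xz, plane_yz):
    """Query a triplane representation: project each 3-D point onto the
    XY, XZ and YZ planes, bilinearly sample a feature map on each plane,
    and sum the three results.

    points : (N, 3) coordinates in [-1, 1]^3
    plane_*: (1, C, H, W) learnable feature planes
    """
    def sample(plane, coords2d):
        grid = coords2d.view(1, -1, 1, 2)                       # (1, N, 1, 2)
        feats = F.grid_sample(plane, grid, align_corners=True)  # (1, C, N, 1)
        return feats[0, :, :, 0].t()                            # (N, C)

    f_xy = sample(plane_xy, points[:, [0, 1]])
    f_xz = sample(plane_xz, points[:, [0, 2]])
    f_yz = sample(plane_yz, points[:, [1, 2]])
    return f_xy + f_xz + f_yz

# Toy usage: 16-channel planes of resolution 64, five random query points.
planes = [torch.randn(1, 16, 64, 64) for _ in range(3)]
pts = torch.rand(5, 3) * 2 - 1
print(triplane_features(pts, *planes).shape)   # (5, 16)
```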
- Enhancing Adversarial Transferability in Object Detection with Bidirectional Feature Distortion
Previous works have shown that perturbing internal-layer features can significantly enhance the transferability of black-box attacks on classifiers. However, these methods have not achieved satisfactory performance when applied to detectors, due to the inherent differences between detector and classifier features. In this paper, we introduce a concise and practical label-free untargeted adversarial attack that leverages only the features extracted from the backbone model.
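A generic feature-space attack sketch in the spirit described above, assuming PyTorch: PGD-style gradient ascent that pushes the backbone features of the adversarial image away from the clean ones under an L_inf budget. It is not the paper's bidirectional feature distortion method; `backbone`, `eps`, `steps`, and `alpha` are illustrative.

```python
import torch

def feature_distortion_attack(backbone, images, eps=8 / 255, steps=10, alpha=2 / 255):
    """Label-free untargeted attack sketch: push the backbone features of the
    adversarial image away from those of the clean image (PGD-style, L_inf).

    backbone : any feature extractor mapping (B, 3, H, W) -> feature tensor
    """
    clean_feats = backbone(images).detach()
    adv = images.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        # Maximize the feature-space distance to the clean representation.
        loss = (backbone(adv) - clean_feats).pow(2).mean()
        grad = torch.autograd.grad(loss, adv)[0]
        adv = adv.detach() + alpha * grad.sign()
        adv = images + (adv - images).clamp(-eps, eps)   # stay in the L_inf ball
        adv = adv.clamp(0, 1)
    return adv.detach()

# Toy usage with a random stand-in "backbone".
net = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3, padding=1), torch.nn.ReLU())
x = torch.rand(1, 3, 32, 32)
x_adv = feature_distortion_attack(net, x)
print((x_adv - x).abs().max())   # <= eps
```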
- MODALITY-DEPENDENT SENTIMENTS EXPLORING FOR MULTI-MODAL SENTIMENT CLASSIFICATION
Recognizing human feelings from images and text is a core challenge of multi-modal data analysis, often applied in personalized advertising. Previous works aim at exploring the shared features, i.e., the matched content between images and texts. However, the modality-dependent sentiment information (private features) in each modality is usually ignored by cross-modal interactions, even though the real sentiment is often reflected in only one modality. In this paper, we propose a Modality-Dependent Sentiment Exploring framework
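A toy sketch of the shared-vs-private feature split described above: shared projections capture matched cross-modal content, private projections keep modality-dependent cues, and both are fused for classification. All encoders and projection matrices here are random placeholders, not the proposed framework.

```python
import numpy as np

rng = np.random.default_rng(0)

def project(x, w):
    return np.tanh(x @ w)

# Placeholder encoder outputs: one image and one text feature vector.
img_feat, txt_feat = rng.standard_normal(512), rng.standard_normal(300)

# Shared projections capture the matched (cross-modal) content;
# private projections keep the modality-dependent sentiment cues.
w_shared_img, w_shared_txt = rng.standard_normal((512, 64)), rng.standard_normal((300, 64))
w_private_img, w_private_txt = rng.standard_normal((512, 64)), rng.standard_normal((300, 64))

shared = project(img_feat, w_shared_img) + project(txt_feat, w_shared_txt)
private = np.concatenate([project(img_feat, w_private_img),
                          project(txt_feat, w_private_txt)])
fused = np.concatenate([shared, private])    # fed to a sentiment classifier
print(fused.shape)                           # (192,)
```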
- SDRNET: SALIENCY-GUIDED DYNAMIC RESTORATION NETWORK FOR RAIN AND HAZE REMOVAL IN NIGHTTIME IMAGES
Due to their different physical imaging models, most haze or rain removal methods designed for daytime images are not suitable for nighttime images. The fog effect produced by the accumulation of rain also poses great challenges for the restoration of low-light nighttime images. To handle the multiple sources of noise interference in this complex situation, we propose a saliency-guided dynamic restoration network (SDRNet) that removes rain and haze in nighttime scenes. First, a saliency-guided detail enhancement preprocessing method is designed to produce images with clearer details as an auxiliary input.
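The actual preprocessing in SDRNet is learned; as a hand-written stand-in, the sketch below modulates a simple unsharp-masking detail boost by a saliency map, so salient regions are sharpened more strongly.

```python
import numpy as np

def saliency_guided_detail_enhance(image, saliency, amount=1.0, ksize=5):
    """Sharpen an image more strongly where the saliency map is high.

    image    : (H, W) grayscale image in [0, 1]
    saliency : (H, W) saliency map in [0, 1]
    A simple box blur stands in for a proper Gaussian here.
    """
    pad = ksize // 2
    padded = np.pad(image, pad, mode="edge")
    blurred = np.zeros_like(image)
    h, w = image.shape
    for dy in range(ksize):
        for dx in range(ksize):
            blurred += padded[dy:dy + h, dx:dx + w]
    blurred /= ksize * ksize
    detail = image - blurred                      # high-frequency residual
    # Saliency controls how much detail is added back at each pixel.
    return np.clip(image + amount * saliency * detail, 0.0, 1.0)

img = np.random.rand(64, 64)
sal = np.random.rand(64, 64)
print(saliency_guided_detail_enhance(img, sal).shape)
```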