Image/Video Processing

Spectro-spatial hyperspectral image reconstruction from interferometric acquisitions

In the last decade, novel hyperspectral cameras have been developed that combine compactness with short acquisition times while retaining the potential to reach spectral/spatial resolution competitive with that of traditional cameras. However, computational effort is required to recover an interpretable data cube.

poster.pdf (107)

Categories:: Image/Video Processing

19 Views

RESIDUAL DENSE SWIN TRANSFORMER FOR CONTINUOUS DEPTH-INDEPENDENT ULTRASOUND IMAGING

Ultrasound imaging is crucial for evaluating organ morphology and function, yet depth adjustment can degrade image quality and field-of-view, presenting a depth-dependent dilemma. Traditional interpolation-based zoom-in techniques often sacrifice detail and introduce artifacts. Motivated by the potential of arbitrary-scale super-resolution to naturally address these inherent challenges, we present the Residual Dense Swin Transformer Network (RDSTN), designed to capture the non-local characteristics and long-range dependencies intrinsic to ultrasound images.

poster.pdf

Poster for ICASSP2024 work "RESIDUAL DENSE SWIN TRANSFORMER FOR CONTINUOUS DEPTH-INDEPENDENT ULTRASOUND IMAGING" (115)

Categories:: Medical image analysis
Image/Video Processing

9 Views

CROSS-LINGUAL LEARNING IN MULTILINGUAL SCENE TEXT RECOGNITION

In this paper, we investigate cross-lingual learning (CLL) for multilingual scene text recognition (STR). CLL transfers knowledge from one language to another. We aim to find the conditions under which knowledge from high-resource languages can be exploited to improve performance in low-resource languages. To do so, we first examine whether two general insights about CLL discussed in previous works apply to multilingual STR: (1) Joint learning with high- and low-resource languages may reduce performance on low-resource languages, and (2) CLL works best between typologically similar languages.

ICASSP2024 poster.pdf (121)

Categories:: Machine Learning for Signal Processing
Image/Video Processing

22 Views

High-order Tensor Pooling with Attention for Action Recognition

We aim to capture high-order statistics of feature vectors formed by a neural network, and propose end-to-end second- and higher-order pooling to form a tensor descriptor. Tensor descriptors require a robust similarity measure because of the small number of aggregated vectors and the burstiness phenomenon, in which a given feature appears more or less frequently than statistically expected. The Heat Diffusion Process (HDP) on a graph Laplacian is closely related to the Eigenvalue Power Normalization (EPN) of the covariance/auto-correlation matrix, whose inverse forms a loopy graph Laplacian.
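As a rough illustration of the terms in this abstract, the sketch below forms a second-order (auto-correlation) descriptor from a set of feature vectors and applies a power function to its eigenvalues. It is a minimal NumPy sketch, not the authors' implementation: the function names, the exponent `gamma`, and the use of a plain eigendecomposition are all assumptions made for illustration.

```python
import numpy as np


def second_order_pool(features):
    """Auto-correlation matrix of N feature vectors stored as rows.

    Hypothetical helper: returns a (d, d) matrix for an (N, d) input.
    """
    features = np.asarray(features, dtype=float)
    return features.T @ features / features.shape[0]


def eigenvalue_power_norm(matrix, gamma=0.5):
    """Raise the eigenvalues of a symmetric PSD matrix to the power gamma.

    gamma is an assumed hyperparameter; values below 1 flatten the spectrum,
    which reduces the dominance of frequently occurring (bursty) features.
    """
    eigvals, eigvecs = np.linalg.eigh(matrix)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative values
    # Rebuild V * diag(eigvals**gamma) * V^T.
    return (eigvecs * eigvals**gamma) @ eigvecs.T
```

With `gamma=0.5` the result is the matrix square root of the descriptor, and with `gamma=1.0` the descriptor is returned unchanged.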

icassp24_hot_lecture.pdf

Presentation slides (108)

icassp24_hop_suppl.pdf

Supplementary material (113)

Categories:: Image/Video Processing

46 Views

UNSUPERVISED REMOTE SENSING HAZE REMOVAL BASED ON SALIENCY-GUIDED TRANSMISSION REFINEMENT

Haze causes information loss and quality degradation in remote sensing images. Unsupervised learning-based dehazing methods aim to reduce reliance on paired hazy images and their labels. However, complex mapping relationships often make network convergence more difficult, resulting in color distortion and loss of texture details in remote sensing images. To address these issues, we propose an unsupervised haze removal method based on saliency-guided transmission refinement for remote sensing images.

ICASSP-8620.pdf (108)

Categories:: Image/Video Processing

24 Views

BOOSTING ZERO-SHOT HUMAN-OBJECT INTERACTION DETECTION WITH VISION-LANGUAGE TRANSFER

Human-Object Interaction (HOI) detection is a crucial task that involves localizing interactive human-object pairs and identifying the actions being performed. Most existing HOI detectors are supervised in nature and lack the ability of zero-shot discovery of unseen interactions. Recently, transformer-based methods have superseded the traditional CNN detectors by aggregating image-wide context but still suffer from the long-tail distribution problem in HOI. In this work, our primary focus is improving HOI detection in images, particularly in zero-shot scenarios.

poster.pdf

Sarma_ZSHOI_ICASSP_2024_poster (134)

Categories:: Image/Video Processing

50 Views

Flow Dynamics Correction for Action Recognition

Various research studies indicate that action recognition performance depends heavily on the types of motion being extracted and how accurately the human actions are represented. In this paper, we investigate different optical flows, and features extracted from them, that capture both short-term and long-term motion dynamics. We perform power normalization on the magnitude component of optical flow for flow dynamics correction to boost subtle or dampen sudden motions.
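The power normalization of flow magnitude mentioned above can be sketched as follows. This is only an illustrative NumPy version under stated assumptions: the function name, the exponent `gamma`, and the choice to rescale by the largest magnitude in the field are hypothetical and are not taken from the paper.

```python
import numpy as np


def power_normalize_flow(flow, gamma=0.5, eps=1e-8):
    """Apply a power function to optical-flow magnitudes, keeping directions.

    flow:  array of shape (..., 2) holding (u, v) displacement per pixel.
    gamma: assumed exponent; gamma < 1 raises weak motions relative to the
           strongest one, gamma > 1 suppresses them.
    """
    flow = np.asarray(flow, dtype=float)
    mag = np.linalg.norm(flow, axis=-1, keepdims=True)
    scale = mag.max()
    if scale < eps:
        # No motion anywhere: nothing to correct.
        return flow.copy()
    # Normalize magnitudes to [0, 1], apply the power, and restore the scale.
    new_mag = scale * (mag / scale) ** gamma
    # Rescale each vector to its new magnitude without changing its direction.
    return flow * (new_mag / np.maximum(mag, eps))
```

In this sketch the largest motion in the field keeps its magnitude, so an exponent below 1 compresses the dynamic range by lifting subtle motions toward it.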

icassp24_hal_poster.pdf

Poster for Flow Dynamics Correction for Action Recognition (ICASSP’24) (112)

Categories:: Image/Video Processing

40 Views

A COMPREHENSIVE FRAMEWORK FOR OCCLUDED HUMAN POSE ESTIMATION

Occlusion presents a significant challenge in human pose estimation. The challenges posed by occlusion can be attributed to the following factors: 1) Data: The collection and annotation of occluded human pose samples are relatively challenging. 2) Feature: Occlusion can cause feature confusion due to the high similarity between the target person and interfering individuals. 3) Inference: Robust inference becomes challenging due to the loss of complete body structural information.

icassp2024.pdf (172)

Categories:: Image/Video Processing

22 Views

PHASE LEARNING BASED ON INTERACTIVE PERCEPTION FOR LIMITED-SAMPLE RESIDENTIAL AREA SEMANTIC SEGMENTATION

Because residential areas are rich in detail and the sharpness of remote sensing images is vulnerable to haze, producing a large-scale dataset with strong labels is both labor-intensive and very difficult. Limited-sample datasets have therefore become a research hotspot in recent years. To address this issue, we propose a semantic segmentation method for residential areas based on phase learning.

lyu-icassp2024-paper2-final.pdf (100)

Categories:: Image/Video Processing

8 Views

HAZY REMOTE SENSING IMAGES SEMANTIC SEGMENTATION FOR WEAKLY ANNOTATION BASED ON SALIENCY-AWARE ALIGNMENT STRATEGY

Semantic segmentation (SS) is an important technique in remote sensing image (RSI) processing. Current research primarily faces two problems: 1) RSIs are easily affected by clouds and haze; 2) SS based on strong annotation requires vast human and time costs. In this paper, we propose a weakly supervised semantic segmentation (WSSS) method for hazy RSIs based on a saliency-aware alignment strategy. First, we design an alignment network (AN) and a target network (TN) with the same structure.

XU-poster.pdf (126)

Categories:: Image/Video Processing

24 Views
