
IEEE ICASSP 2023 - IEEE International Conference on Acoustics, Speech and Signal Processing is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2023 conference will feature world-class presentations by internationally renowned speakers, cutting-edge session topics and provide a fantastic opportunity to network with like-minded professionals from around the world. Visit the website.

- Read more about Image Generation is MAY All You Need for VQA
- Log in to post comments
Visual Question Answering (VQA) stands to benefit from the boost of increasingly sophisticated Pretrained Language Model (PLM) and Computer Vision-based models. In particular, many language modality studies have been conducted using image captioning or question generation with the knowledge ground of PLM in terms of data augmentation. However, image generation of VQA has been implemented in a limited way to modify only certain parts of the original image in order to control the quality and uncertainty.
- Categories:

- Read more about Hypernetwork-based Adaptive Image Restoration
- 1 comment
- Log in to post comments
Adaptive image restoration models can restore images with different degradation levels at inference time without the need to retrain the model. We present an approach that is highly accurate and allows a significant reduction in the number of parameters. In contrast to existing methods, our approach can restore images using a single fixed-size model, regardless of the number of degradation levels. On popular datasets, our approach yields state-of-the-art results in terms of size and accuracy for a variety of image restoration tasks, including denoising, deJPEG, and super-resolution.
- Categories:

- Read more about Pondering about Task Spatial Misalignment: Classification-Localization Equilibrated Object Detection
- Log in to post comments
Object detection is a fundamental task in computer vision, consisting of both classification and localization tasks. Previous works mostly perform classification and localization with shared feature extractor like Convolution Neural Network. However, the tasks of classification and localization exhibit different sensitivities with regard to the same feature, hence the "task spatial misalignment" issue. This issue can result in a hedge issue between the performances of localizer and classifier.
- Categories:

- Read more about GaitMixer: Skeleton-based Gait Representation Learning via Wide-spectrum Multi-axial Mixer
- Log in to post comments
Most existing gait recognition methods are appearance-based, which rely on the silhouettes extracted from the video data of human walking activities. The less-investigated skeleton-based gait recognition methods directly learn the gait dynamics from 2D/3D human skeleton sequences, which are theoretically more robust solutions in the presence of appearance changes caused by clothes, hairstyles, and carrying objects. However, the performance of skeleton-based solutions is still largely behind the appearance-based ones.
- Categories:

- Read more about SPATIAL INFERENCE USING CENSORED MULTIPLE TESTING WITH FDR CONTROL
- Log in to post comments
A wireless sensor network performs spatial inference on a physical phenomenon of interest. The areas in which this phenomenon exhibits interesting or anomalous behavior are identified whilst controlling false positives. We expand our previous work based on multiple hypothesis testing (MHT) and local false discovery rates to save energy and reduce spectrum use. The number of transmissions from sensors producing uninformative statistics are reduced by introducing censoring for MHT that imposes a communication rate constraint while maintaining the desired performance.
- Categories: