IEEE ICIP 2024

IEEE ICIP 2024 - The International Conference on Image Processing (ICIP), sponsored by the IEEE Signal Processing Society, is the premier forum for the presentation of technological advances and research results in the fields of theoretical, experimental, and applied image and video processing. Held annually since 1994, ICIP brings together leading engineers and scientists in image and video processing from around the world.

LENSLESS PHASE RETRIEVAL WITH REGULARIZATION BY BLIND NOISE MAP ESTIMATION AND DENOISING

These are the presentation slides for our ICIP 2024 paper, in which we address the challenge of regularization in lensless single-shot phase retrieval (PR) by noise suppression.

LENSLESS PHASE RETRIEVAL WITH REGULARIZATION BY BLIND NOISE MAP ESTIMATION AND DENOISING.pdf

Categories: Image/Video Processing

CASCADING UNKNOWN DETECTION WITH KNOWN CLASSIFICATION FOR OPEN SET RECOGNITION

Deep learners tend to perform well when trained under the closed set assumption but struggle when deployed under open set conditions. This motivates the field of Open Set Recognition, in which we seek to give deep learners the ability to recognize whether a data sample belongs to one of the known classes seen during training or comes from the surrounding infinite world. Existing open set recognition methods typically rely upon a single function for the dual task of distinguishing knowns from unknowns and distinguishing among the known classes.
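
As an illustration of the alternative that the title suggests, the two tasks can be handled by two separate functions applied in sequence: one decides known versus unknown, and only samples judged known reach the classifier. The sketch below is a toy under stated assumptions, not the paper's method: the class centers, the distance-based `unknown_score`, the nearest-center `classify_known`, and the threshold are all hypothetical stand-ins for trained models.

```python
UNKNOWN = "unknown"

# Hypothetical 1-D class centers standing in for a trained model.
CENTERS = {"cat": 0.0, "dog": 10.0}


def unknown_score(sample):
    """Stage 1 (assumed): distance to the nearest known class center."""
    return min(abs(sample - center) for center in CENTERS.values())


def classify_known(sample):
    """Stage 2 (assumed): nearest-center classifier over the known classes."""
    return min(CENTERS, key=lambda label: abs(sample - CENTERS[label]))


def cascade_predict(sample, threshold=2.0):
    """Reject unknowns first; classify only the samples that pass."""
    if unknown_score(sample) > threshold:
        return UNKNOWN
    return classify_known(sample)


predictions = [cascade_predict(x) for x in (0.5, 9.0, 5.0)]
```

Because the stages are separate, the rejection threshold can be tuned without retraining or altering the known-class classifier.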

Cascading Unknown Detection with Known Classification for OSR.pdf

Categories: Pattern recognition and classification (MLR-PATT)

STREAMLINED HYBRID ANNOTATION FRAMEWORK USING SCALABLE CODESTREAM FOR BANDWIDTH-RESTRICTED UAV OBJECT DETECTION

Emergency response missions depend on the fast relay of visual information, a task to which unmanned aerial vehicles are well adapted. However, the effective use of unmanned aerial vehicles is often compromised by bandwidth limitations that impede fast data transmission, thereby delaying the quick decision-making necessary in emergency situations. To address these challenges, this paper presents a streamlined hybrid annotation framework that utilizes the JPEG 2000 compression algorithm to facilitate object detection under limited bandwidth.

ICIP24_poster_2085.pdf

Categories: Image/Video Processing

SEMI-SUPERVISED GRAPHICAL DEEP DICTIONARY LEARNING FOR HYPERSPECTRAL IMAGE CLASSIFICATION FROM LIMITED SAMPLES

In this work, we propose a semi-supervised deep feature generation network that accounts for local similarities. It is based on the deep dictionary learning (DDL) framework. The formulation accounts for two unique aspects of hyperspectral classification. First, the total number of pixels (samples) to be labeled is fixed; this permits a semi-supervised formulation in which only a few pixels are labeled as training data. Second, the pixels are spatially correlated; this leads to a graph regularization formulation.
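
For readers unfamiliar with graph regularization, a common generic form is sketched below. It is the standard graph Laplacian penalty, not necessarily the exact term used in the paper, and the symbols are assumptions introduced here: $z_i$ is the learned feature of pixel $i$, $Z$ stacks the $z_i$ as columns, $W$ is a symmetric matrix of spatial-similarity weights $w_{ij}$, and $D$ is its diagonal degree matrix.

```latex
\frac{1}{2}\sum_{i,j} w_{ij}\,\lVert z_i - z_j \rVert_2^2
  \;=\; \operatorname{tr}\!\left( Z L Z^{\top} \right),
\qquad L = D - W, \quad D_{ii} = \sum_{j} w_{ij}.
```

Minimizing this term pulls the features of strongly connected (spatially similar) pixels toward each other, which is how spatial correlation enters the objective.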

ICIP_poster_2.pdf

Categories: Other

Non-separable Wavelet Transform Using Learnable Convolutional Lifting Steps

Wavelet transforms have been a relevant topic in signal processing for many years. One of the most common strategies when designing wavelet transforms is the use of lifting schemes, known for their perfect reconstruction properties and flexible design. This paper introduces a novel 2D non-separable lifting design methodology based on deep learning architectures. The proposed method is assessed within the context of end-to-end lossless image compression.
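
The perfect reconstruction property of lifting can be seen in a few lines. The following is a minimal 1D sketch, not the paper's 2D non-separable learned design: it assumes an even-length integer signal, and the `predict` and `update` operators are arbitrary placeholders. The inverse simply undoes each step in reverse order, so reconstruction is exact whatever the operators are, which is what makes it possible to swap in learned ones.

```python
def lifting_forward(signal, predict, update):
    """One lifting step: split, predict odd from even, update even from detail."""
    even, odd = signal[0::2], signal[1::2]
    detail = [o - predict(e) for o, e in zip(odd, even)]
    approx = [e + update(d) for e, d in zip(even, detail)]
    return approx, detail


def lifting_inverse(approx, detail, predict, update):
    """Undo the update, then the prediction, then merge even and odd samples."""
    even = [a - update(d) for a, d in zip(approx, detail)]
    odd = [d + predict(e) for d, e in zip(detail, even)]
    signal = [0] * (len(even) + len(odd))
    signal[0::2], signal[1::2] = even, odd
    return signal


x = [5, 7, 2, 9, 4, 4, 8, 1]

# Haar-like integer operators (an illustrative choice).
approx, detail = lifting_forward(x, lambda e: e, lambda d: d // 2)
restored = lifting_inverse(approx, detail, lambda e: e, lambda d: d // 2)

# A deliberately odd nonlinear pair: reconstruction is still exact.
p, u = (lambda e: (3 * e * e) // 7), (lambda d: abs(d) % 5)
restored_nonlinear = lifting_inverse(*lifting_forward(x, p, u), p, u)
```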

ICIP 24 Presentation - Non-separable Wavelet Transform Using Learnable Convolutional Lifting Steps.pdf

Categories: Multimedia Signal Processing

Reading Is Believing: Revisiting Language Bottleneck Models for Image Classification

We revisit language bottleneck models as an approach to ensuring the explainability of deep learning models for image classification. Because information is inevitably lost when images are converted into language, language bottleneck models are considered less accurate than standard black-box models. Recent image captioners based on large-scale vision-and-language foundation models, however, can describe images in verbal detail with an accuracy previously believed to be unrealistic.

ICIP Poster Presentation - 2521.pdf

Categories: Image/Video Processing

Rethinking temporal self-similarity for repetitive action counting

Counting repetitive actions in long untrimmed videos is a challenging task that has many applications such as rehabilitation.

ICIP24_RACnet_supp.pdf

RACnet supplementary material (176)

Categories: Image/Video Processing

On the exploitation of DCT-traces in the Generative-AI domain

Deepfakes represent one of the toughest challenges in cybersecurity and digital forensics, especially given the high-quality results obtained with recent generative AI-based solutions. Almost all generative models leave unique traces in synthetic data that, if analyzed and identified in detail, can be exploited to overcome the generalization limitations of existing deepfake detectors.
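
To make the "DCT" in the title concrete, the sketch below computes the orthonormal 2D DCT-II of an 8x8 block in plain Python. It only shows how frequency coefficients are obtained; which statistics of these coefficients the paper treats as traces is not stated in this abstract. The function names and the flat test block are illustrative assumptions.

```python
import math


def dct_1d(values):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        total = sum(
            v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(values)
        )
        out.append(scale * total)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


# A constant block has all of its energy in the DC coefficient.
flat = [[10.0] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
ac_energy = sum(c * c for row in coeffs for c in row) - coeffs[0][0] ** 2
```

For a real image, the non-DC (AC) coefficients would be nonzero, and their distribution over many blocks is the kind of quantity a frequency-domain analysis would inspect.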

ICIP_2024___DCT_Explainability (14).pdf

supplementary material (175)

Categories: Other

Segment Any Object Model (SAOM): Real-to-Simulation Fine-Tuning Strategy for Multi-Class Multi-Instance Segmentation

Multi-class multi-instance segmentation is the task of identifying masks for multiple object classes and multiple instances of the same class within an image. The Segment Anything Model (SAM) is a new foundation model designed for promptable multi-class multi-instance segmentation. SAM is able to segment objects in any image using a pre-defined point grid as an input prompt in the "everything" mode. However, out of the box, SAM tends to output part or sub-part segmentation masks (under-segmentation) in different real-world applications.
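
A pre-defined point grid of the kind mentioned above can be sketched as a uniform lattice of prompt points. The snippet is an assumption about how such a grid might be laid out (cell-centered, in normalized coordinates) and does not call SAM itself; `point_grid` and `to_pixels` are hypothetical helper names.

```python
def point_grid(points_per_side):
    """Cell-centered n x n grid of (x, y) points in normalized [0, 1] coordinates."""
    offset = 1.0 / (2 * points_per_side)
    coords = [offset + i / points_per_side for i in range(points_per_side)]
    return [(x, y) for y in coords for x in coords]


def to_pixels(points, width, height):
    """Scale normalized points to pixel coordinates for a given image size."""
    return [(x * width, y * height) for x, y in points]


grid = point_grid(2)
pixel_prompts = to_pixels(grid, width=640, height=480)
```

Each grid point would then serve as one prompt, so a denser grid yields more candidate masks at a higher computational cost.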

ICIP_2024_suppl_final.pdf

Supplementary Materials: Segment Any Object Model (SAOM) (182)

ICIP presentation.pptx

Presentation slides (98)

Categories: Image/Video Processing

