Image, Video, and Multidimensional Signal Processing

Semantic-based Sentence Recognition in Images Using Bimodal Deep Learning

presentation_slides_YiZheng.pdf

Categories: Image, Video, and Multidimensional Signal Processing

Semi-Supervised Object Detection with Sparsely Annotated Dataset

When an anchor-based object detector is trained on a sparsely annotated dataset, the way positive examples are located can degrade performance. Anchor-based detectors select positive examples by the IoU between anchors and ground-truth bounding boxes, so in a sparsely annotated image, objects that are not annotated can be assigned as negative examples, that is, treated as background.
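
The assignment problem described in this abstract can be illustrated with a minimal sketch. It is not the paper's method: the function names `iou` and `assign_labels`, the `(x1, y1, x2, y2)` box format, and the 0.5 positive threshold are all assumptions chosen for illustration. It only shows how an anchor covering a real but unannotated object ends up labeled as background.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_labels(anchors, gt_boxes, pos_thresh=0.5):
    """Label each anchor 1 (positive) or 0 (negative/background).

    pos_thresh is a hypothetical threshold, not a value from the paper.
    """
    labels = []
    for anchor in anchors:
        # Best overlap with any annotated box; 0.0 if nothing is annotated.
        best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
        labels.append(1 if best >= pos_thresh else 0)
    return labels


# Two anchors, each sitting exactly on a real object.
anchors = [(0, 0, 10, 10), (20, 20, 30, 30)]

# Fully annotated image: both anchors are positives.
full = assign_labels(anchors, [(0, 0, 10, 10), (20, 20, 30, 30)])

# Sparsely annotated image: the second object has no box,
# so its anchor is treated as background.
sparse = assign_labels(anchors, [(0, 0, 10, 10)])
```

Under these assumptions, `full` marks both anchors as positive, while `sparse` marks the second anchor as negative even though it covers a real object.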

ICIP2021 paper - SEMI-SUPERVISED OBJECT DETECTION WITH SPARSELY ANNOTATED DATASET.pdf

Categories: Image, Video, and Multidimensional Signal Processing

Inverse Halftone Colorization: Making Halftone Prints Color Photos

InverseHalftoneColorization_poster.pdf

InverseHalftoneColorization_presentation.pdf

Categories: Image, Video, and Multidimensional Signal Processing

M3VSNet: Unsupervised Multi-metric Multi-view Stereo Network

Present multi-view stereo (MVS) methods based on supervised learning perform impressively compared with traditional MVS methods. However, the ground-truth depth maps needed for training are hard to obtain and cover only a limited range of scenarios. In this paper, we propose a novel unsupervised multi-metric MVS network, named M^3VSNet, for dense point cloud reconstruction without any supervision.

slide1091.pdf

Categories: Image, Video, and Multidimensional Signal Processing

SOLVING FOURIER PHASE RETRIEVAL WITH A REFERENCE IMAGE AS A SEQUENCE OF LINEAR INVERSE PROBLEMS

ICIP2021_Fahimeh_Poster.pdf

Poster for the paper titled "SOLVING FOURIER PHASE RETRIEVAL WITH A REFERENCE IMAGE AS A SEQUENCE OF LINEAR INVERS" (159)

Sequential Fourier Phase Retrieval ICIP 2021.pdf

Presentation Slides for the paper titled "SOLVING FOURIER PHASE RETRIEVAL WITH A REFERENCE IMAGE AS A SEQUENCE OF LINEAR INVERS" (193)

Categories: Image, Video, and Multidimensional Signal Processing; Other

Adversarial Unsupervised Video Summarization Augmented with Dictionary Loss

Automated unsupervised video summarization by key-frame extraction consists in identifying representative video frames, best abridging a complete input sequence, and temporally ordering them to form a video summary, without relying on manually constructed ground-truth key-frame sets. State-of-the-art unsupervised deep neural approaches consider the desired summary to be a subset of the original sequence, composed of video frames that are sufficient to visually reconstruct the entire input.

Poster_ICIP_Kaseris.pdf

Categories: Image, Video, and Multidimensional Signal Processing

INTEGRATED GRAD-CAM: SENSITIVITY-AWARE VISUAL EXPLANATION OF DEEP CONVOLUTIONAL NETWORKS VIA INTEGRATED GRADIENT-BASED SCORING

Visualizing the features captured by Convolutional Neural Networks (CNNs) is one of the conventional approaches to interpreting the predictions made by these models in numerous image recognition applications. Grad-CAM is a popular solution that provides such a visualization by combining the activation maps obtained from the model. However, the average gradient-based terms deployed in this method underestimate the contribution of the representations discovered by the model to its predictions.

ICASSP-IGCAM.pdf

Presentation slide deck of IGCAM XAI algorithm (152)

IG-CAM_Poster.pdf

Poster of IGCAM XAI algorithm (175)

Categories: Image, Video, and Multidimensional Signal Processing

ADA-SISE: ADAPTIVE SEMANTIC INPUT SAMPLING FOR EFFICIENT EXPLANATION OF CONVOLUTIONAL NEURAL NETWORKS

Explainable AI (XAI) is an active research area that aims to interpret a neural network’s decisions, ensuring transparency and trust in task-specific learned models. Recently, perturbation-based model analysis has shown better interpretation, but back-propagation techniques still prevail because of their computational efficiency. In this work, we combine both approaches as a hybrid visual explanation algorithm and propose an efficient interpretation method for convolutional neural networks.

ICASSP-AdaSISE-slides.pdf

Presentation slide deck of Ada-SISE XAI algorithm (330)

4216.pdf

Poster of Ada-SISE XAI algorithm (158)

Categories: Image, Video, and Multidimensional Signal Processing; Neural network learning (MLR-NNLR)

LIGHT FIELD STYLE TRANSFER WITH LOCAL ANGULAR CONSISTENCY

ICASSP2021_Poster.pdf

Categories: Image, Video, and Multidimensional Signal Processing

MULTI-GRANULARITY FEATURE INTERACTION AND RELATION REASONING FOR 3D DENSE ALIGNMENT AND FACE RECONSTRUCTION

In this paper, we propose a multi-granularity feature interaction and relation reasoning network (MFIRRN) that can recover a detail-rich 3D face and perform more accurate dense alignment in an unconstrained environment. Traditional 3DMM-based methods directly regress parameters, so the reconstructed 3D face lacks fine-grained details. To address this, we use different branches to capture discriminative features at different granularities, especially local features at medium and fine granularities.

ICASSP-poster.pdf

Categories: Image, Video, and Multidimensional Signal Processing
