
ICIP 2021 - The International Conference on Image Processing (ICIP), sponsored by the IEEE Signal Processing Society, is the premier forum for the presentation of technological advances and research results in the fields of theoretical, experimental, and applied image and video processing. Held annually since 1994, ICIP brings together leading engineers and scientists in image and video processing from around the world.

Automatic License Plate Recognition (ALPR) has remained a persistent research topic for years because of its many practical applications, especially in Intelligent Transportation Systems (ITS). Many currently available solutions are still not robust under varied real-world conditions and often impose constraints such as fixed backgrounds, constant distance, and fixed camera angles. This paper presents an efficient multi-language repudiate ALPR system based on machine learning.


In the last few years, single image super-resolution (SISR) has benefited a lot from the rapid development of deep convolutional neural networks (CNNs), and the introduction of attention mechanisms further improves the performance of SISR. However, previous methods use one or more types of attention independently in multiple stages and ignore the correlations between different layers in the network.


Images captured in low-light conditions have a narrow dynamic range and a dark tone, and are seriously degraded by noise because of the low signal-to-noise ratio (SNR). The discrete wavelet transform (DWT) is invertible, so it can decompose an image into subbands without information loss while minimizing redundancy. In this paper, we propose subband-adaptive enhancement of low-light images using wavelet-based convolutional neural networks. We adopt the DWT to achieve joint contrast enhancement and noise reduction. We combine DWT with convolutional neural networks (CNNs), i.e.
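The invertibility claim above can be illustrated with a minimal sketch. It assumes a one-level 2D Haar wavelet and an image with even dimensions; the abstract does not say which wavelet the authors use, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical. The subband labels (LL, LH, HL, HH) follow one common convention; others swap LH and HL.

```python
def haar_dwt2(img):
    """One-level 2D Haar DWT of an even-sized image given as a list of rows.

    Returns four half-resolution subbands (LL, LH, HL, HH).
    Assumption: orthonormal Haar filters, chosen only for illustration.
    """
    h, w = len(img), len(img[0])
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, h, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for j in range(0, w, 2):
            # The 2x2 block of pixels that maps to one coefficient per subband.
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            ll_row.append((a + b + c + d) / 2)  # local average (approximation)
            lh_row.append((a - b + c - d) / 2)  # difference across columns
            hl_row.append((a + b - c - d) / 2)  # difference across rows
            hh_row.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the full-resolution image."""
    img = []
    for i in range(len(ll)):
        top, bottom = [], []
        for j in range(len(ll[0])):
            s, x, y, z = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            top.extend([(s + x + y + z) / 2, (s - x + y - z) / 2])
            bottom.extend([(s + x - y - z) / 2, (s - x - y + z) / 2])
        img.append(top)
        img.append(bottom)
    return img
```

The four subbands together hold exactly as many coefficients as the input has pixels, so the decomposition is non-redundant, and applying `haar_idwt2` to the output of `haar_dwt2` recovers the input. A subband-adaptive method could process each subband differently before inverting.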


Tone-mapping is one of the prevailing methods to overcome the limitations of showing high dynamic range images on low dynamic range display devices, but the tone-mapped output image may suffer from saturated regions with texture and color information loss. In this paper, a novel approach is proposed to solve the so-called clipping problem in tone-mapped high dynamic range images. A successful saturation correction framework, which relies on linear embeddings, difference of pixel intensities and gradient-guided block-search, is developed as a post-processing technique to tone-mapping.


This paper proposes an effective technique for multi-exposure image fusion and visible-infrared image fusion problems. Multi-exposure fusion algorithms generally extract faulty weight maps when the input stack contains multiple and/or severely over-exposed images. To overcome this issue, an alternative method is developed for weight map characterization and refinement in addition to the perspectives of linear embeddings of images and adaptive morphological masking. This framework has then been extended to the visible and infrared image fusion problem.
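To make the role of weight maps concrete, here is a minimal sketch of weighted multi-exposure fusion. It is not the method of the paper: it swaps in a simple Gaussian "well-exposedness" weight around mid-gray (in the style of classical exposure fusion), and the function names, the `sigma` value, and the grayscale [0, 1] image format are all assumptions made for illustration.

```python
import math


def well_exposedness(pixel, sigma=0.2):
    """Weight in (0, 1] that peaks when the pixel is mid-gray (0.5).

    Assumption: a Gaussian weight used only as a stand-in for the
    paper's weight map characterization.
    """
    return math.exp(-((pixel - 0.5) ** 2) / (2 * sigma ** 2))


def fuse_exposures(stack, sigma=0.2, eps=1e-12):
    """Fuse a list of same-sized grayscale images with values in [0, 1].

    Each output pixel is the weighted average of the input pixels at
    that position, with weights normalized to sum to one.
    """
    h, w = len(stack[0]), len(stack[0][0])
    fused = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # eps keeps the normalization safe if every weight underflows.
            weights = [well_exposedness(img[i][j], sigma) + eps for img in stack]
            total = sum(weights)
            fused[i][j] = sum(
                wt * img[i][j] for wt, img in zip(weights, stack)
            ) / total
    return fused
```

With this weight, a saturated pixel contributes little when another exposure is well exposed at the same position. When every image in the stack is over-exposed at a pixel, all weights are small and the normalized result is still bright, which is the kind of failure case that motivates refining the weight maps.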


In an industrial environment, object detection is a challenging task due to the absence of real images and real-time requirements for the object detector, usually embedded in a mobile device. Using 3D models, it is however possible to create a synthetic dataset to train a neural network, although the performance on real images is limited by the domain gap. In this paper, we study the performance of a Convolutional Neural Network (CNN) designed to detect objects in real-time: Single-Shot Detector (SSD) with a MobileNet backbone.


Erasing text from images is a common image-editing task in the film industry and shared media. Existing text-erasing models either tend to produce artifacts or fail to remove all the text in real-world images. In this paper, we follow a two-stage text-erasing framework that first masks the text by segmentation and then inpaints the masked region to create a text-erased image. Our proposed text mask generator is designed to accurately cover text, which, combined with inpainting, can produce reliable text-erased results.
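The two-stage structure described above (mask, then inpaint) can be sketched with toy stand-ins. Everything here is an assumption for illustration: `make_text_mask` uses a plain intensity threshold in place of the learned segmentation model, and `inpaint` fills masked pixels by repeatedly averaging known 4-neighbors in place of a learned inpainting network. Only the pipeline shape reflects the abstract.

```python
def make_text_mask(img, threshold):
    """Stage 1 (toy stand-in): mark pixels at or above threshold as text."""
    return [[px >= threshold for px in row] for row in img]


def inpaint(img, mask, max_iters=100):
    """Stage 2 (toy stand-in): fill masked pixels from known neighbors.

    Each pass fills every masked pixel that touches at least one known
    pixel with the mean of its known 4-neighbors, working inward.
    """
    h, w = len(img), len(img[0])
    out = [[float(px) for px in row] for row in img]
    unknown = [row[:] for row in mask]
    for _ in range(max_iters):
        filled = []
        for i in range(h):
            for j in range(w):
                if not unknown[i][j]:
                    continue
                vals = [
                    out[y][x]
                    for y, x in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                    if 0 <= y < h and 0 <= x < w and not unknown[y][x]
                ]
                if vals:
                    filled.append((i, j, sum(vals) / len(vals)))
        if not filled:
            break  # nothing left to fill, or no known pixel to start from
        # Apply the pass at once so fills within a pass do not feed each other.
        for i, j, value in filled:
            out[i][j] = value
            unknown[i][j] = False
    return out


def erase_text(img, threshold=128):
    """Run both stages: mask the text, then inpaint the masked region."""
    return inpaint(img, make_text_mask(img, threshold))
```

The sketch also shows why mask accuracy matters: any text pixel the mask misses is treated as known background and is both left in place and blended into its neighbors during inpainting.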