The International Conference on Image Processing (ICIP), sponsored by the IEEE Signal Processing Society, is the premier forum for the presentation of technological advances and research results in the fields of theoretical, experimental, and applied image and video processing. Held annually since 1994, ICIP brings together leading engineers and scientists in image and video processing from around the world.

Portrait segmentation has become an active research topic.
In this paper we propose a novel framework to meet
the high precision that portrait segmentation
demands in boundary areas through deep refinement of the
portrait matting. Our approach introduces three novel
techniques. First, a trimap is proposed by fusing information
coming from two well-known techniques for image
segmentation, i.e., Mask R-CNN and DensePose. Second,
an alpha matting algorithm runs over the previous trimap
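
The abstract above is cut off and does not say how the two segmentation outputs are fused, so the following is only a minimal sketch under an assumed rule: pixels that both masks mark as person become foreground, pixels that neither marks become background, and pixels where they disagree become the unknown band handed to the matting step. The function name `fuse_trimap` and the label values are hypothetical and are not taken from the paper.

```python
# Hypothetical trimap labels (assumed values, not from the paper).
FOREGROUND, BACKGROUND, UNKNOWN = 255, 0, 128


def fuse_trimap(mask_a, mask_b):
    """Fuse two binary masks (lists of rows of 0/1) into a trimap.

    Assumed rule: agreement on 1 -> foreground, agreement on 0 -> background,
    disagreement -> unknown.
    """
    trimap = []
    for row_a, row_b in zip(mask_a, mask_b):
        fused_row = []
        for a, b in zip(row_a, row_b):
            if a and b:
                fused_row.append(FOREGROUND)
            elif not a and not b:
                fused_row.append(BACKGROUND)
            else:
                fused_row.append(UNKNOWN)
        trimap.append(fused_row)
    return trimap


# Stand-ins for a Mask R-CNN mask and a DensePose mask of the same size.
mask_rcnn = [[1, 1, 0], [0, 1, 0]]
densepose = [[1, 0, 0], [0, 1, 1]]
trimap = fuse_trimap(mask_rcnn, densepose)
```

Under this assumed rule the unknown band is exactly the region where the two detectors disagree, which tends to concentrate along the portrait boundary.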

Real-world recognition and classification tasks in computer vision rarely occur in controlled environments and often involve open-set conditions. Previous work on real-world recognition is knowledge- and labor-intensive, because good performance has to be pursued across numerous task domains. Automated Machine Learning (AutoML) approaches offer an easier way to apply advanced machine learning techniques, reduce the demand for experienced human experts, and improve classification performance on closed sets.

In this paper, we introduce an end-to-end machine learning-based system for classifying autism spectrum disorder (ASD) using facial attributes such as expressions, action units, arousal, and valence. Our system classifies ASD using representations of different facial attributes from convolutional neural networks, which are trained on images in the wild. Our experimental results show that different facial attributes used in our system are statistically significant and improve sensitivity, specificity, and F1 score of ASD classification by a large margin.
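
For readers unfamiliar with the three reported metrics, the sketch below shows how sensitivity, specificity, and F1 score are conventionally computed from binary predictions. It uses the standard definitions only; the function name `classification_metrics` and the toy labels are illustrative assumptions and do not reproduce the paper's data or results.

```python
def classification_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, f1) for binary labels, 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard each ratio against an empty denominator.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return sensitivity, specificity, f1


# Toy labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
sensitivity, specificity, f1 = classification_metrics(y_true, y_pred)
```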

Product placement, also called advertisement embedding, places specific products in an image or a video in order to attract consumers to buy them. However, adding advertisement objects to images is difficult, because both where to place the product and how to blend it with the background must be considered. In this paper, to overcome this issue, we present a novel hierarchical framework with a conditional generative adversarial network that adds advertisement objects to all kinds of scene images. The key point of our framework is learning the relation between the surroundings and the products.

Even though zero padding is a staple of convolutional
neural networks for maintaining the output size, it is highly questionable
because it significantly alters the input distribution
around the border region. To mitigate this problem, in this paper
we propose a new padding technique termed distribution
padding. The goal of the method is to approximately maintain
the statistics of the input border regions. We introduce
two different ways to achieve this goal. In both approaches,
the padded values are derived from the means of the border
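
The abstract is truncated before it describes either of the two approaches, so the sketch below is not the authors' method. It only illustrates the general idea of padding with border means instead of zeros, under an assumed rule: each side is filled with the mean of the adjacent border row or column, and each corner with the average of its two neighboring side means. The function name `mean_border_pad` is hypothetical.

```python
def mean_border_pad(img, pad=1):
    """Pad a 2D list with border means instead of zeros (assumed rule)."""
    h, w = len(img), len(img[0])
    top = sum(img[0]) / w                      # mean of the first row
    bottom = sum(img[-1]) / w                  # mean of the last row
    left = sum(row[0] for row in img) / h      # mean of the first column
    right = sum(row[-1] for row in img) / h    # mean of the last column

    out = []
    for _ in range(pad):
        out.append([(top + left) / 2] * pad + [top] * w + [(top + right) / 2] * pad)
    for row in img:
        out.append([left] * pad + list(row) + [right] * pad)
    for _ in range(pad):
        out.append([(bottom + left) / 2] * pad + [bottom] * w + [(bottom + right) / 2] * pad)
    return out


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
padded = mean_border_pad(img, pad=1)
```

Compared with zero padding, the padded values here stay within the range of the neighboring border pixels, so the local mean near the border shifts far less.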

We address the task of estimating depth from a single intensity image via a novel convolutional neural network (CNN) encoder-decoder architecture, which learns the depth information using example pairs of color images and their corresponding depth maps. The proposed model integrates residual connections within pooling and up-sampling layers, together with hourglass networks that operate on the encoded features and thus process them at various scales. Furthermore, the model is optimized under both a perceptual loss and a mean squared error loss.

This paper presents a novel approach for continuous dynamic hand gesture recognition from RGB video input. Our approach contains two main modules. First, in the gesture spotting module, the video sequence containing continuous gestures is pre-segmented into isolated gestures. Second, the gesture classification module classifies the segmented gestures. In the gesture spotting module, the motion of the palm and the finger movements are fed into a Bidirectional Long Short-Term Memory (Bi-LSTM) network for gesture spotting.

Due to the large number and huge diversity of attributes, pedestrian attribute recognition in video surveillance scenarios is a challenging task in the field of computer vision. Unlike most previous works, which focus only on the problem of extremely imbalanced attribute distribution, this work puts forward a multi-task convolutional neural network (MTCNN) based on a new way of grouping attributes, which exploits the spatial correlations among attributes while preserving some independence for each attribute.
