The International Conference on Image Processing (ICIP), sponsored by the IEEE Signal Processing Society, is the premier forum for the presentation of technological advances and research results in the fields of theoretical, experimental, and applied image and video processing. Held annually since 1994, ICIP brings together leading engineers and scientists in image and video processing from around the world.

In this paper, we aim to retrieve exactly the same shoes from online shop photos (shop scenario) given an everyday shoe photo (street scenario). There are large visual differences between street and shop shoe images. To handle this discrepancy between scenarios, we learn a feature embedding for shoes via a viewpoint-invariant triplet network, whose feature activations reflect the inherent similarity between any two shoe images.
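As a rough illustration of the kind of objective a triplet network is commonly trained with, the sketch below computes a margin-based triplet loss on toy embedding vectors. The abstract does not state the loss, distance, or margin actually used; squared Euclidean distance, the margin of 0.2, and the function names are assumptions made for illustration only.

```python
from typing import Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length embeddings."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 0.2,  # assumed value, not taken from the paper
) -> float:
    """Hinge-style triplet loss.

    The loss is zero once the negative (a different shoe) is at least
    `margin` farther from the anchor than the positive (the same shoe).
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)
```

Here the anchor could be a street photo embedding, the positive the matching shop photo, and the negative a shop photo of a different shoe; that pairing is one plausible reading of the abstract rather than a confirmed detail.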

This paper proposes a video summarization method based on novel spatio-temporal features that combine motion magnitude, object class prediction, and saturation. Motion magnitude measures how much motion there is in a video. Object class prediction provides information about an object in a video. Saturation measures the colorfulness of a video. Convolutional neural networks (CNNs) are incorporated for object class prediction. The per-shot sums of the normalized features are ranked in descending order, and the summary consists of the highest-ranking shots.
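The ranking step described above can be sketched as follows. This is a minimal example that assumes the three per-shot feature values have already been computed, that min-max scaling is the normalization, and that the features are weighted equally; none of these details are given in the abstract, and the function names are hypothetical.

```python
from typing import Dict, List


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values to [0, 1]; a constant feature maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_shots(features: Dict[str, List[float]], top_k: int) -> List[int]:
    """Return the indices of the top_k shots, highest summed score first.

    `features` maps a feature name to one value per shot.
    """
    normalized = [min_max_normalize(vals) for vals in features.values()]
    # Sum the normalized features shot by shot.
    scores = [sum(per_shot) for per_shot in zip(*normalized)]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:top_k]


shot_features = {
    "motion": [0.1, 0.9, 0.5],
    "object": [0.2, 0.8, 0.9],
    "saturation": [0.3, 0.3, 0.6],
}
summary = rank_shots(shot_features, top_k=2)  # shots 2 and 1, in that order
```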

The quantization parameter (QP) value and Lagrangian multiplier (λ) are the key factors for an encoder to achieve the trade-off between visual quality and bit-rate in next-generation multimedia communications. In this work, we propose a novel temporal redundancy ratio (TRR) model to determine hierarchical QPs.

A sound way to localize occluded people is to project the foregrounds from multiple camera views to a reference view by homographies and find the foreground intersections. However, this may give rise to phantoms due to foreground intersections from different people. In this paper, each intersection region is warped back to the original camera view and is associated with a candidate box of the average pedestrian size at that location. Then a joint occupancy likelihood is calculated for each intersection region.
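The projection-and-intersection idea in the first sentence can be illustrated with a small sketch. It is a deliberate simplification: foregrounds are given as sparse point lists rather than masks, points are forward-mapped into a coarse grid on the reference view, and the homography matrices are made up. The paper's warping back to camera views and its occupancy likelihood are not modeled here, and all names are hypothetical.

```python
import math
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, float]
Matrix = Sequence[Sequence[float]]  # 3x3 homography


def apply_homography(H: Matrix, point: Point) -> Point:
    """Map a 2D point through a 3x3 homography using homogeneous coordinates."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity")
    return (u / w, v / w)


def foreground_intersection(
    foregrounds: List[List[Point]],
    homographies: List[Matrix],
    cell: float = 1.0,  # assumed grid resolution on the reference view
) -> Set[Tuple[int, int]]:
    """Grid cells of the reference view hit by the foreground of every camera."""
    cells_per_view = []
    for points, H in zip(foregrounds, homographies):
        cells = set()
        for p in points:
            x, y = apply_homography(H, p)
            cells.add((math.floor(x / cell), math.floor(y / cell)))
        cells_per_view.append(cells)
    return set.intersection(*cells_per_view) if cells_per_view else set()


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
shift_x = [[1, 0, 1], [0, 1, 0], [0, 0, 1]]  # translates x by one unit
common = foreground_intersection(
    [[(0.5, 0.5), (1.5, 0.5)], [(0.5, 0.5)]],
    [identity, shift_x],
)
```

In this toy setup only the cell (1, 0) is covered by both views. With several people in the scene, such an intersection can also arise from foregrounds of different people, which is the phantom problem the abstract refers to.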

State-of-the-art video coding techniques employ block-based illumination compensation to improve coding efficiency. In this work, we propose a Lifting-based Illumination Adaptive Transform (LIAT) to exploit temporal redundancy among frames that have illumination variations, such as the frames of low frame rate video or multi-view video. LIAT employs a mesh-based spatially affine model to represent illumination variations between two frames. In LIAT, transformed frames are jointly compressed, together with illumination information,

Computational aesthetics has seen much progress in recent years with the increasing popularity of deep learning methods. In this paper, we present two approaches that leverage Global Average Pooling (GAP) to reduce the complexity of deep convolutional neural networks. The first model fine-tunes a standard CNN with a newly introduced GAP layer. The second approach extracts global and local CNN codes by reducing the dimensionality of convolution layers with individual GAP operations.
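For readers unfamiliar with GAP, the operation itself is simple: each feature-map channel is replaced by its spatial mean, so a stack of C maps of size H x W becomes a C-dimensional vector. The sketch below shows this on nested Python lists; it illustrates the generic operation only, not the paper's network architecture, and the names are chosen for illustration.

```python
from typing import List

FeatureMap = List[List[float]]  # one channel, laid out as height x width


def global_average_pool(feature_maps: List[FeatureMap]) -> List[float]:
    """Collapse each channel to its mean, giving one value per channel."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        if not values:
            raise ValueError("empty channel")
        pooled.append(sum(values) / len(values))
    return pooled


maps = [
    [[1.0, 2.0], [3.0, 4.0]],  # channel 0, mean 2.5
    [[0.0, 0.0], [0.0, 8.0]],  # channel 1, mean 2.0
]
codes = global_average_pool(maps)
```

Because the output length depends only on the number of channels, GAP removes the need for large fully connected layers over the spatial grid, which is the usual source of the complexity reduction.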

As image tampering becomes ever more sophisticated and commonplace, the need for image forensics algorithms that can accurately and quickly detect forgeries grows. In this paper, we revisit the ideas of image querying and retrieval to provide clues to better localize forgeries. We propose a method to perform large-scale image forensics on the order of one million images, with the help of an image search algorithm and database, to gather contextual clues as to where tampering may have taken place.

Departing from traditional digital forensics modeling, which seeks to analyze single objects in isolation, multimedia phylogeny analyzes the evolutionary processes that influence digital objects and collections over time. One of its integral pieces is provenance filtering, which consists of searching a potentially large pool of objects for the most related ones with respect to a given query, in terms of possible ancestors (donors or contributors) and descendants.
