Sorry, you need to enable JavaScript to visit this website.

ICIP 2021 - The International Conference on Image Processing (ICIP), sponsored by the IEEE Signal Processing Society, is the premier forum for the presentation of technological advances and research results in the fields of theoretical, experimental, and applied image and video processing. ICIP has been held annually since 1994, brings together leading engineers and scientists in image and video processing from around the world. Visit website.

Autonomous Vehicles promise to transport people in a safer, accessible, and even efficient way. Nowadays, real-world autonomous vehicles are build by large teams from big companies with a tremendous amount of engineering effort. Deep Reinforcement Learning can be used instead, without domain experts, to learn end-to-end driving policies. Here, we combine Curriculum Learning with deep reinforcement learning, in order to learn without any prior domain knowledge, an end-to-end competitive driving policy for the CARLA autonomous driving simulator.


In machine learning, classifiers are typically susceptible to noise in the training data. In this work, we aim at reducing intra-class noise with the help of graph filtering to improve the classification performance. Considered graphs are obtained by connecting samples of the training set that belong to a same class depending on the similarity of their representation in a latent space. We show that the proposed graph filtering methodology has the effect of asymptotically reducing intra-class variance, while maintaining the mean.


Video delivery over the Internet has been becoming a commodity in recent years, owing to the widespread use of Dynamic Adaptive Streaming over HTTP(DASH). The DASH specification defines a hierarchical data model for Media Presentation Descriptions (MPDs) in terms of segments. This paper focuses on segmenting video into multiple shots for encoding in Video on Demand (VoD) HTTP Adaptive Streaming (HAS) applications.


In many of the existing alpha matting implementations, an intermediate representation called a trimap needs to be created manually. To automate the process, we propose a generic neural network for trimap generation based on saliency map detection. Our model multi-modally learns a saliency map and a trimap simultaneously. Because of this structure, the network focuses on reducing the error of the trimap especially within the areas with high salience.


No-reference (NR) image sharpness assessment is an important issue for image quality assessment and algorithm performance evaluation. Many objective NR sharpness assessment metrics have been proposed which are often intended to be strongly associated with the human visual system (HVS). However, recent studies show that common sharpness assessment indicators may misjudge the degree of blurring for images with shallow depth of field that are often used to highlight the main subject in the view.


Quarter sampling is a novel sensor design that allows for an acquisition of higher resolution images without increasing the number of pixels. When being used for video data, one out of four pixels is measured in each frame. Effectively, this leads to a non-regular spatio-temporal sub-sampling. Compared to purely spatial or temporal sub-sampling, this allows for an increased reconstruction quality, as aliasing artifacts can be reduced. For the fast reconstruction of such sensor data with a fixed mask, recursive variant of frequency selective reconstruction (FSR) was proposed.