
Segmenting a document image into text-lines and words finds applications in many research areas of Document Image Analysis (DIA), such as OCR, word spotting, and document retrieval. However, carrying out segmentation directly on compressed document images is still an unexplored and challenging research area. Since JPEG is the most widely accepted compression algorithm, this research paper attempts to segment a JPEG-compressed printed text document image into text-lines and words without fully decompressing the image.

Active speaker detection (ASD) and virtual cinematography (VC) can significantly improve the remote user experience of a video conference by automatically panning, tilting, and zooming a video conferencing camera: users subjectively rate an expert video cinematographer’s video significantly higher than unedited video. We describe a new automated ASD and VC system that performs within 0.3 MOS of an expert cinematographer, based on subjective ratings on a 1-5 scale.

Reconstructing a signal corrupted by impulsive noise is important in several applications, including impulsive noise removal from images, audio, and video, and separating text from images. In this paper we propose a new method to reconstruct a noise-corrupted signal where both signal and noise are sparse but in different domains. We apply our algorithm to impulsive noise removal from images, covering both Salt-and-Pepper Noise (SPN) and Random-Valued Impulsive Noise (RVIN), and compare our results with other notable algorithms in the literature.

In this work we explore an overcomplete representation of multiview imagery for the purpose of compression. We present a rate-distortion (R-D) driven approach to decompose multiview datasets into two additive parts which can be interpreted as being the diffuse and specular components. We apply different transforms to each component such that the compressibility of input data is improved. We describe a framework which performs the R-D optimized separation in a registered domain to avoid the complexity of warping

In order to predict where humans look in a 3D immersive environment, saliency can be computed using either 3D saliency models or view-based approaches (2D projection). Building a complete 3D model is still a challenging task that has not been investigated enough, while 2D imaging approaches have been extensively studied and have shown solid performance.

Analysis of hand skeleton data can be used to understand patterns in manipulation and assembly tasks. This paper introduces a graph-based representation of hand skeleton data and proposes a method to perform unsupervised temporal segmentation of a sequence of subtasks in order to evaluate the efficiency of an assembly task. We explore the properties of different choices of hand graphs and their spectral decomposition. A comparative performance evaluation of these graphs is presented in the context of complex activity segmentation.
