
Despite significant advances in recent years, image stitching still lacks a robust solution. Most feature-based image stitching algorithms align images using either a homography-based transformation or content-preserving warping. The pairwise homography-based approach fails to handle parallax, whereas content-preserving warping does not preserve the structural properties of the images. In this paper, we propose a nonlinear optimization that finds the global homographies from pairwise homography estimates and point correspondences.
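As background for the terms used in this abstract, the sketch below shows the plain baseline of chaining pairwise homographies into global ones, i.e., mapping every image into the frame of the first image. It is not the paper's nonlinear optimization, which the abstract does not specify. The function names and the convention that `pairwise[i]` maps image `i+1` into image `i` are assumptions made for illustration.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def chain_homographies(pairwise):
    """Chain pairwise homographies into global ones.

    Assumed convention (hypothetical): pairwise[i] maps points of
    image i+1 into the frame of image i. The returned list holds, for
    each image k, the homography mapping image k into image 0's frame.
    """
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    chained = [identity]
    for h in pairwise:
        # Image k+1 -> image k -> ... -> image 0.
        chained.append(matmul3(chained[-1], h))
    return chained


def apply_homography(h, point):
    """Map a 2D point through a 3x3 homography (homogeneous divide)."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)


# Usage: two pure translations between three images.
t1 = [[1.0, 0.0, 10.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t2 = [[1.0, 0.0, 5.0], [0.0, 1.0, 2.0], [0.0, 0.0, 1.0]]
global_h = chain_homographies([t1, t2])
origin_in_frame0 = apply_homography(global_h[2], (0.0, 0.0))
```

Simple chaining of this kind accumulates the error of each pairwise estimate along the sequence, which is one reason a joint refinement over all images can be attractive.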


End-to-end encryption makes it difficult for mobile network operators to assess the quality of HTTP Adaptive Streaming (HAS), where quality assessment remains coarse-grained, e.g., detecting only whether any stalling occurs during the whole playback. To address this issue, this paper proposes an attention-based hybrid RNN-HMM model, which integrates an HMM with an attention mechanism to predict the player states. The model is trained and evaluated on the download speed and player state sequences of encrypted video sessions collected from YouTube.


Video object tracking (VOT) in realistic scenarios is a difficult task. Image factors such as occlusion, clutter, confusion, object shape, and zooming, among others, affect the performance of video trackers. While these conditions do affect tracker performance, there is no clear distinction between scene-content challenges, such as occlusion and clutter, and challenges due to distortions generated by the capture, compression, processing, and transmission of videos. This paper is concerned with the latter interpretation of quality as it affects VOT performance.


In this paper, we introduce a variation of a state-of-the-art real-time tracker (CFNet) that makes the original algorithm robust to target loss without significant computational overhead. The new method is based on the assumption that the feature map can be used to estimate the tracking confidence more accurately.


Videos of people speaking are hard to follow across international borders because audiences from different demographics do not understand the language. Such videos are often supplemented with subtitles. However, subtitles hamper the viewing experience because the viewer's attention is split. Simple audio dubbing in a different language makes the video appear unnatural due to unsynchronized lip motion. In this paper, we propose a system for automated cross-language lip synchronization for re-dubbed videos.


Ambisonics, i.e., full-sphere surround sound, is essential alongside 360° visual content to provide a realistic virtual reality (VR) experience. While the capture of 360° visual content has gained a tremendous boost recently, estimating the corresponding spatial sound remains challenging because it requires sound-field microphones or information about the sound-source locations. In this paper, we introduce the novel problem of generating Ambisonics for 360° videos using audiovisual cues.


Color and depth information provided simultaneously in RGB-D images can be used to segment scenes into disjoint regions. In this paper, a graph-based segmentation method for RGB-D images is proposed. An adaptive, data-driven combination of color and normal variation defines the dissimilarity between two adjacent pixels, and a novel region-merging threshold that exploits normal information in adjacent regions controls how region merging proceeds.
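To make the idea of a combined color and normal dissimilarity concrete, here is a minimal sketch of an edge weight between two adjacent pixels. It is an illustration only: the abstract describes an adaptive, data-driven combination, whereas this sketch uses a fixed mixing weight `alpha`, which is a hypothetical stand-in. The function name, the 8-bit RGB range, and the unit-length normals are likewise assumptions.

```python
import math


def pixel_dissimilarity(color_a, color_b, normal_a, normal_b, alpha=0.5):
    """Blend color variation and normal variation into one edge weight.

    Assumptions (not from the source): colors are 8-bit RGB triples,
    normals are unit-length 3-vectors, and alpha is a fixed weight.
    """
    # Color term: Euclidean RGB distance scaled to [0, 1].
    color_term = math.dist(color_a, color_b) / (255.0 * math.sqrt(3.0))

    # Normal term: 0 for parallel normals, 1 for opposite normals.
    dot = sum(p * q for p, q in zip(normal_a, normal_b))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding drift
    normal_term = (1.0 - dot) / 2.0

    return alpha * color_term + (1.0 - alpha) * normal_term


# Usage: same color, but surfaces meeting at a right angle.
weight = pixel_dissimilarity((120, 80, 40), (120, 80, 40),
                             (0.0, 0.0, 1.0), (1.0, 0.0, 0.0))
```

In a graph-based segmentation, weights like this would label the edges between neighboring pixels, and regions would be merged when the edge weight falls below a merging threshold.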


With the development of augmented reality, the delivery and storage of 3D content have become an important research area. Among the proposals for point cloud compression collected by MPEG, Apple’s Test Model Category 2 (TMC2) achieves the highest quality for 3D sequences under a bitrate constraint. However, the TMC2 framework is not spatially scalable. In this paper, we add interpolation components which make TMC2 suitable for flexible resolution. We apply a patch-aware averaging filter to eliminate most outliers which result from the interpolation.

