Camera pose estimation plays a crucial role in stitching overlapping images captured by a camera into a broad view of a region of interest. In this paper, we propose a robust camera pose estimation approach for stitching images of a large 3D surface with known geometry. In particular, given a collection of images, we first estimate a matrix of relative poses for all image pairs in the collection, where each entry of the matrix is computed by solving a perspective-n-point (PnP) problem over the corresponding pair of images.
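The abstract does not give the estimation details. As a minimal sketch of the data structure only, the following builds a pairwise relative-pose matrix from absolute poses that are assumed to be known already; in the proposed approach each entry would instead come from a PnP solve. The helper names and the world-to-camera convention (x_cam = R x_world + t) are assumptions made for illustration.

```python
# Illustrative sketch: poses are (R, t) with R a 3x3 list of lists and t a
# length-3 list, under the assumed convention x_cam = R @ x_world + t.

def matmul3(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose3(a):
    """Transpose of a 3x3 matrix (the inverse, for a rotation)."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

def matvec3(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def relative_pose(pose_i, pose_j):
    """Pose that maps camera-i coordinates to camera-j coordinates."""
    R_i, t_i = pose_i
    R_j, t_j = pose_j
    R_ij = matmul3(R_j, transpose3(R_i))          # R_j * R_i^T
    rotated = matvec3(R_ij, t_i)
    t_ij = [t_j[k] - rotated[k] for k in range(3)]  # t_j - R_ij * t_i
    return R_ij, t_ij

def relative_pose_matrix(poses):
    """n x n matrix whose (i, j) entry is the relative pose from i to j."""
    n = len(poses)
    return [[relative_pose(poses[i], poses[j]) for j in range(n)]
            for i in range(n)]
```

Entry (i, j) and entry (j, i) are inverses of each other, which is one consistency check a robust estimator could exploit when some pairwise PnP solutions are unreliable.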

Compressive covariance sampling (CCS) theory aims to recover the covariance matrix (CM) of a signal, instead of the signal itself, from a reduced set of random linear projections. Although several theoretical works demonstrate the advantages of CCS in compressive spectral imaging tasks, no real optical implementation has been proposed.
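The relation that makes covariance recovery from projections possible can be written compactly. The notation below is an assumption for illustration and does not come from the abstract: a zero-mean signal $x \in \mathbb{R}^{n}$ with covariance $\Sigma$, projection matrices $P_k \in \mathbb{R}^{m \times n}$ with $m < n$, and noise-free measurements. The least-squares estimator shown is one generic formulation, not necessarily the one used in the work.

```latex
\begin{aligned}
y_k &= P_k x, \\
\Sigma_{y,k} &= \mathbb{E}\!\left[ y_k y_k^{\top} \right] = P_k \, \Sigma \, P_k^{\top}, \\
\hat{\Sigma} &= \arg\min_{\Sigma \succeq 0} \; \sum_{k} \left\lVert S_k - P_k \, \Sigma \, P_k^{\top} \right\rVert_F^{2},
\end{aligned}
```

Here $S_k$ denotes the sample covariance of the measurements taken with $P_k$. A single projection matrix with $m < n$ generally cannot identify $\Sigma$, which is why several distinct $P_k$ appear in the sum.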

In Virtual Reality (VR) systems, head-mounted displays (HMDs) are widely used to present VR content. Displaying immersive (360-degree video) scenes poses greater challenges because of limits on computing power, frame rate, and transmission bandwidth. To address these problems, a variety of foveated video compression and streaming methods have been proposed. These methods exploit the nonuniform sampling density of the retinal photoreceptors and ganglion cells, which decreases rapidly with increasing eccentricity.

Real-time semantic segmentation is playing an increasingly important role in computer vision, driven by growing demand from mobile devices and autonomous driving. It is therefore important to achieve a good trade-off among performance, model size, and inference speed. In this paper, we propose a Channel-wise Feature Pyramid (CFP) module to balance those factors. Based on the CFP module, we build CFPNet for real-time semantic segmentation, which applies a series of dilated convolution channels to extract effective features.
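The abstract does not describe the internals of the CFP module. As a minimal sketch of the underlying building block only, the following shows a 1D dilated convolution and how dilation widens the receptive field without adding weights. The function names are hypothetical, and the 1D setting is a simplification of the 2D convolutions a segmentation network would use.

```python
def receptive_field(kernel_size, dilation):
    """Input samples spanned by one output sample of a dilated kernel."""
    return dilation * (kernel_size - 1) + 1

def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1D cross-correlation with `dilation` steps between kernel taps."""
    span = receptive_field(len(kernel), dilation)
    return [
        sum(kernel[k] * signal[start + k * dilation]
            for k in range(len(kernel)))
        for start in range(len(signal) - span + 1)
    ]

# Same 3-tap kernel, different dilation rates: the weight count stays at 3
# while the span of input covered by each output grows.
signal = [1, 2, 3, 4, 5]
dense = dilated_conv1d(signal, [1, 1, 1], dilation=1)   # taps at offsets 0, 1, 2
sparse = dilated_conv1d(signal, [1, 1, 1], dilation=2)  # taps at offsets 0, 2, 4
```

Running several such branches with different dilation rates in parallel is a common way to collect multi-scale context at a fixed parameter budget, which matches the trade-off the abstract targets.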

Performing sound source separation and visual object segmentation jointly in naturally occurring videos is a notoriously difficult task, especially in the absence of annotated data. In this study, we leverage the concurrency between the audio and visual modalities to address the joint audio-visual segmentation problem in a self-supervised manner. Human beings interact with the physical world through several sensory systems, such as vision, hearing, and movement. The usefulness of the interplay of such systems lies in the concept of degeneracy.

The automatic diagnosis of lung infections using chest computed tomography (CT) scans has recently gained remarkable significance, particularly during the COVID-19 pandemic, when early diagnosis of the disease is of utmost importance. In addition, infection diagnosis is the main building block of most automated diagnostic/prognostic frameworks. Recently, due to the devastating effects on the body of the radiation caused by CT scans, there has been a surge in acquiring low-dose and ultra-low-dose CT scans instead of the

Recent work shows that factorizing 3D convolutional filters into separate spatial and temporal components yields impressive improvements in action recognition. However, traditional temporal convolution, which operates strictly along the temporal dimension, aggregates unrelated features, because the feature maps of fast-moving objects shift in spatial position from frame to frame. In this paper, we propose a novel and effective Multi-Directional convolution (MDConv), which extracts features along different spatial-temporal orientations.
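To make the factorization in the first sentence concrete, the following sketch compares the weight count of a full 3D convolution with that of a spatial convolution followed by a temporal one. It illustrates the generic factorization only, not MDConv itself; the function names and the intermediate channel count `c_mid` are assumptions, and biases are ignored.

```python
def conv3d_params(c_in, c_out, k_t, k_h, k_w):
    """Weights in a full k_t x k_h x k_w 3D convolution (no bias)."""
    return c_in * c_out * k_t * k_h * k_w

def factorized_params(c_in, c_mid, c_out, k_t, k_h, k_w):
    """Weights in a 1 x k_h x k_w spatial conv followed by a k_t x 1 x 1 temporal conv."""
    spatial = c_in * c_mid * k_h * k_w   # acts within each frame
    temporal = c_mid * c_out * k_t       # acts across frames at a fixed pixel
    return spatial + temporal

full = conv3d_params(64, 64, 3, 3, 3)
split = factorized_params(64, 64, 64, 3, 3, 3)
```

The temporal stage here mixes features at the same pixel location across frames, which is exactly the behavior the abstract identifies as a problem for fast-moving objects.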
