We propose Shift R-CNN, a hybrid model for monocular 3D object detection that combines deep learning with the power of geometry. We adapt a Faster R-CNN network to regress initial 2D and 3D object properties and combine it with a least-squares solution to the inverse 2D-to-3D geometric mapping problem, using the camera projection matrix. The closed-form solution of the mathematical system, together with the initial output of the adapted Faster R-CNN, is then passed through a final ShiftNet network that refines the result using our newly proposed Volume Displacement Loss.
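
To make the geometric step concrete, the sketch below sets up the kind of linear least-squares system the abstract refers to: each 2D box edge constrains where a chosen 3D box corner may project, which is linear in the unknown translation once the rotation and dimensions are fixed. The function name, the constraint encoding, and the fixed corner-to-edge assignment are illustrative assumptions, not the authors' implementation.

    import numpy as np

    def solve_translation(P, R, corners_obj, constraints):
        """Least-squares 3D translation from 2D box constraints.

        P           : 3x4 camera projection matrix
        R           : 3x3 object rotation built from the estimated yaw
        corners_obj : (8, 3) 3D box corners in the object frame
        constraints : list of (corner_index, axis, value), where axis is
                      0 for an image u-coordinate and 1 for a v-coordinate,
                      and value is the 2D box edge the corner must project onto
        """
        A, b = [], []
        for k, axis, value in constraints:
            Xc = R @ corners_obj[k]                   # rotated corner, translation still unknown
            row = P[axis, :3] - value * P[2, :3]      # coefficient of the translation (linear term)
            rhs = -(row @ Xc) - (P[axis, 3] - value * P[2, 3])
            A.append(row)
            b.append(rhs)
        # closed-form least-squares estimate of the 3D box centre
        T, *_ = np.linalg.lstsq(np.asarray(A), np.asarray(b), rcond=None)
        return T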

Cloud segmentation is a vital task in applications that utilize satellite imagery. A common obstacle in using deep learning-based methods for this task is the insufficient number of images with their annotated ground truths. This work presents a content-aware unpaired image-to-image translation algorithm. It generates synthetic images with different land cover types from original images while preserving the locations and the intensity values of the cloud pixels. Therefore, no manual annotation of ground truth in these images is required.
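
The pixel-preservation idea can be illustrated with a simple compositing step (a minimal sketch, assuming the translated background and the original cloud mask are already available; the learned content-aware translation network itself is not shown):

    import numpy as np

    def preserve_clouds(original, translated, cloud_mask):
        """Keep the cloud pixels from the original image and take the translated
        background everywhere else, so the existing cloud annotation stays valid.

        original, translated : (H, W, C) float arrays
        cloud_mask           : (H, W) boolean array, True on cloud pixels
        """
        out = translated.copy()
        out[cloud_mask] = original[cloud_mask]
        return out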

Unsupervised object discovery in images involves uncovering recurring patterns that define objects and discriminate them against the background. This is more challenging than image clustering because the size and the location of the objects are not known: these extra degrees of freedom increase the complexity of the problem. In this work, we propose StampNet, a novel autoencoding neural network that simultaneously localizes and categorizes shapes (objects) over a simple background in images.
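
A toy sketch of the simultaneous localize-and-categorize idea is shown below: an encoder predicts a location heatmap and a class distribution, and the decoder reconstructs the image by stamping a learned class prototype at the predicted location. All layer sizes and the stamping mechanism are assumptions for illustration, not the published StampNet architecture.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class StampNetSketch(nn.Module):
        """Toy autoencoder in the spirit of StampNet (illustrative only)."""

        def __init__(self, num_classes=3, stamp_size=9):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            )
            self.loc_head = nn.Conv2d(16, 1, 1)          # where the object is
            self.cls_head = nn.Linear(16, num_classes)   # which prototype to stamp
            self.stamps = nn.Parameter(torch.randn(num_classes, 1, stamp_size, stamp_size))
            self.pad = stamp_size // 2

        def forward(self, x):
            h = self.encoder(x)
            B, _, H, W = h.shape
            loc = F.softmax(self.loc_head(h).view(B, -1), dim=1).view(B, 1, H, W)
            cls = F.softmax(self.cls_head(h.mean(dim=(2, 3))), dim=1)        # (B, K)
            # blend prototypes by the class distribution, then place at the location
            stamp = (cls[:, :, None, None, None] * self.stamps).sum(dim=1)   # (B, 1, s, s)
            recon = torch.stack([
                F.conv2d(loc[b:b + 1], stamp[b:b + 1], padding=self.pad)
                for b in range(B)
            ]).squeeze(1)
            return recon, loc, cls

Training such a sketch would compare the reconstruction with the input (e.g. mean squared error), so the location and class emerge without labels.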

The in-loop filter is a key component of High Efficiency Video Coding (HEVC) that effectively removes compression artifacts. Recently, many newly proposed methods combine residual learning and dense connections to construct deeper networks for better in-loop filtering performance. However, the long-term dependency between blocks is neglected, and information usually passes between blocks only after dimension compression.
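
For reference, the residual-learning-plus-dense-connection pattern the abstract refers to typically looks like the block below (an illustrative PyTorch sketch of the common building block, not the paper's exact network):

    import torch
    import torch.nn as nn

    class ResidualDenseBlock(nn.Module):
        """Residual dense block as commonly used in CNN-based in-loop filters:
        every convolution sees all previous feature maps (dense connections),
        and the fused output is added back to the block input (residual learning)."""

        def __init__(self, channels=64, growth=32, layers=4):
            super().__init__()
            self.convs = nn.ModuleList(
                nn.Conv2d(channels + i * growth, growth, 3, padding=1)
                for i in range(layers)
            )
            self.fuse = nn.Conv2d(channels + layers * growth, channels, 1)  # dimension compression
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            feats = [x]
            for conv in self.convs:
                feats.append(self.relu(conv(torch.cat(feats, dim=1))))
            return x + self.fuse(torch.cat(feats, dim=1))                   # residual connection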

A quantitative understanding of complex biological systems such as tissues requires reconstructing the structure of the different components of the system. Fluorescence microscopy provides the means to visualize several tissue components simultaneously. However, it can be time-consuming and is limited by the number of fluorescent markers that can be used. In this study, we describe a toolbox of algorithms based on convolutional neural networks for predicting 3D tissue structures by learning features embedded within single-marker images.
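
The prediction task reduces to image-to-image regression on volumes; a minimal sketch of such a network is given below (the toolbox's actual architectures, channel counts, and training losses are not reproduced here):

    import torch
    import torch.nn as nn

    class MarkerToStructureNet(nn.Module):
        """Minimal 3D image-to-image CNN sketch: predict one tissue structure
        from a single-marker fluorescence volume (illustrative only)."""

        def __init__(self, features=16):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv3d(1, features, 3, padding=1), nn.ReLU(),
                nn.Conv3d(features, features, 3, padding=1), nn.ReLU(),
                nn.Conv3d(features, 1, 1),          # predicted structure channel
            )

        def forward(self, volume):                  # volume: (B, 1, D, H, W)
            return self.net(volume)

Such a model would be trained with a regression loss (e.g. mean squared error) against volumes acquired with the target marker.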

Recent works have shown the vulnerability of deep convolutional neural networks (DCNNs) to adversarial examples with malicious perturbations. In particular, black-box attacks, which require no knowledge of the parameters or architectures of the target models, are feared as realistic threats. To address this problem, we propose a method using an ensemble of models trained on color-quantized data with loss maximization. Color quantization can allow the trained models to focus on learning conspicuous spatial features, enhancing the robustness of DCNNs to adversarial examples.
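
The two ingredients, color quantization of the data and ensembling of the resulting models, can be sketched as follows (uniform quantization and softmax averaging are assumptions for illustration; the loss-maximization training itself is not shown):

    import torch

    def color_quantize(images, levels=8):
        """Uniformly quantize pixel intensities in [0, 1] to a small number of
        levels, pushing models toward conspicuous spatial features."""
        return torch.floor(images * levels).clamp(max=levels - 1) / (levels - 1)

    def ensemble_predict(models, images, levels_per_model):
        """Average the softmax outputs of models, each paired with its own
        number of color levels (hypothetical ensemble wiring)."""
        probs = [
            torch.softmax(model(color_quantize(images, lv)), dim=1)
            for model, lv in zip(models, levels_per_model)
        ]
        return torch.stack(probs).mean(dim=0)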

Studies on the generalization performance of machine learning algorithms under the scope of information theory suggest that compressed representations can guarantee good generalization, inspiring many compression-based regularization methods. In this paper, we introduce REVE, a new regularization scheme. Noting that compressing the representation can be sub-optimal, our first contribution is to identify a variable that is directly responsible for the final prediction. Our method aims at compressing the class-conditioned entropy of this variable.
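
As a loose illustration of compressing a class-conditioned entropy, one can penalize the within-class variance of a representation as a Gaussian proxy; the sketch below shows only that general idea and is not the REVE objective:

    import torch

    def class_conditioned_variance_penalty(z, labels, num_classes):
        """Hedged sketch of a compression-style regularizer: penalize the
        within-class variance of a representation z as a Gaussian proxy for
        its class-conditioned entropy (illustration only)."""
        penalty = z.new_tensor(0.0)
        for c in range(num_classes):
            zc = z[labels == c]                       # samples of class c in the batch
            if zc.shape[0] > 1:
                penalty = penalty + zc.var(dim=0, unbiased=False).mean()
        return penalty / num_classes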

This paper proposes a modified YOLOv3 with an extra object depth prediction module for obstacle detection and avoidance. We use a pre-processed KITTI dataset to train the proposed unified model for (i) object detection and (ii) depth prediction, and we use the AirSim flight simulator to generate synthetic aerial images to verify that our model can be applied to different data domains.
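
A common way to add a depth prediction module to a YOLO-style detector is an extra output channel per anchor; the head below sketches that wiring (channel counts, anchor count, and the depth parameterization are assumptions, not the paper's configuration):

    import torch
    import torch.nn as nn

    class DetectionDepthHead(nn.Module):
        """Sketch of a YOLO-style prediction head extended with one extra
        channel per anchor for object depth (illustrative only)."""

        def __init__(self, in_channels=256, num_anchors=3, num_classes=8):
            super().__init__()
            # per anchor: 4 box offsets + 1 objectness + class scores + 1 depth
            self.out_per_anchor = 4 + 1 + num_classes + 1
            self.num_anchors = num_anchors
            self.pred = nn.Conv2d(in_channels, num_anchors * self.out_per_anchor, 1)

        def forward(self, features):
            B, _, H, W = features.shape
            out = self.pred(features).view(B, self.num_anchors, self.out_per_anchor, H, W)
            boxes, objectness = out[:, :, :4], out[:, :, 4:5]
            classes, depth = out[:, :, 5:-1], out[:, :, -1:]
            return boxes, objectness, classes, depth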
