
In this paper we propose speaker characterization using speaker embeddings from a combined time delay neural network and long short-term memory neural network (TDNN-LSTM). Three types of front-end feature extraction are investigated to find good features for speaker embedding. Three kinds of data augmentation are used to increase the amount and diversity of the training data. The proposed methods are evaluated on the National Institute of Standards and Technology (NIST) speaker recognition evaluation (SRE) tasks.


360-degree cameras have recently become popular because they can capture the entire 360-degree scene, and a large number of related applications have been springing up. In this paper, we propose a deep-learning-based object detector that can be applied directly to 360-degree images. The proposed detector is based on modifications of the Faster R-CNN model. Three modification schemes are proposed: (1) distortion data augmentation, (2) multi-kernel layers to improve detection accuracy for distorted objects, and (3) adding position information to the model so that it can learn spatial information.
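The abstract does not say how position information is added to the model. One common way to do this, shown here purely as an assumption and not as the authors' method, is to append normalized row and column coordinates to every pixel as extra input channels. The function name `add_position_channels` and the nested-list image layout are hypothetical choices for this sketch.

```python
def add_position_channels(image):
    """Append normalized (row, col) coordinates in [-1, 1] to each pixel.

    `image` is a nested list of shape H x W x C. This is an illustrative
    assumption about how position information could be supplied to a
    detector; it is not taken from the paper.
    """
    height = len(image)
    width = len(image[0])

    def norm(index, size):
        # Map 0..size-1 onto [-1, 1]; a single-pixel axis maps to 0.
        return 0.0 if size == 1 else 2.0 * index / (size - 1) - 1.0

    return [
        [
            list(pixel) + [norm(r, height), norm(c, width)]
            for c, pixel in enumerate(row)
        ]
        for r, row in enumerate(image)
    ]


# A 3 x 3 single-channel image of zeros gains two coordinate channels.
toy_image = [[[0] for _ in range(3)] for _ in range(3)]
with_position = add_position_channels(toy_image)
```

In an equirectangular 360-degree image the row coordinate corresponds to latitude, which is where projection distortion varies most, so a row channel is one plausible way for a model to relate appearance to distortion.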


Lattice decoders constructed with neural networks are presented. Firstly, we show how the fundamental parallelotope is used as a compact set for the approximation by a neural lattice decoder. Secondly, we introduce the notion of Voronoi reduced lattice basis. As a consequence, a first optimal neural lattice decoder is built from Boolean equations and the facets of the Voronoi cell. This decoder needs no learning. Finally, we present two neural decoders with learning. It is shown


Here, a novel approach is proposed to generate age progression (i.e., future looks) and regression (i.e., previous looks) of persons based on their face images. The proposed method addresses face aging as an unsupervised image-to-image translation problem where the goal is to translate a face image belonging to an age class to an image of


Network traffic classification, which associates traffic flows with specific categories or intruders, plays an important role in network management and security. For network traffic classification in wireless communications, the major challenge is encrypted data. Researchers are usually not authorized to access the internal contents of traffic flows and must instead analyze traffic features. Machine learning algorithms are widely used as classifiers, and representation learning makes feature extraction more accurate by avoiding manual operation.


Depth estimation from a single monocular image has reached strong performance thanks to recent work based on deep networks. However, because the literature proposes many different losses, architectures, and experimental conditions, it is difficult to establish their respective influence on performance. In this paper we propose an in-depth study of various losses and experimental conditions for depth regression on the NYU dataset. From this study we propose a new network for depth estimation combining an encoder-decoder architecture with an adversarial loss.


• To automatically segment the optic disk (OD) and cup regions in fundus images in order to derive clinical parameters, such as the cup-to-disk diameter ratio (CDR), that assist glaucoma diagnosis. An eye fundus camera is non-invasive and low-cost, enabling the screening of a large number of patients quickly.

• Discuss various strategies on how to leverage multiple doctor annotations and prioritize pixels belonging to different regions during network optimization.

• Evaluate proposed approaches on the Drishti-GS dataset.
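
The cup-to-disk diameter ratio mentioned in the first point can be computed from a pair of segmentation masks roughly as follows. This is a minimal sketch under stated assumptions: the masks are binary nested lists, the diameter is measured vertically (the source does not say which axis is used), and the function names `vertical_diameter` and `cup_to_disk_ratio` are hypothetical.

```python
def vertical_diameter(mask):
    """Return the vertical extent, in pixels, of the nonzero region of a 2D mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return 0
    return rows[-1] - rows[0] + 1


def cup_to_disk_ratio(cup_mask, disk_mask):
    """Vertical cup diameter divided by vertical disk diameter.

    Using the vertical axis is an assumption of this sketch.
    """
    disk = vertical_diameter(disk_mask)
    if disk == 0:
        raise ValueError("disk mask is empty")
    return vertical_diameter(cup_mask) / disk


# Toy 8 x 8 masks: the disk spans rows 1-6 and the cup spans rows 3-5.
disk_mask = [[1 if 1 <= r <= 6 and 1 <= c <= 6 else 0 for c in range(8)]
             for r in range(8)]
cup_mask = [[1 if 3 <= r <= 5 and 3 <= c <= 5 else 0 for c in range(8)]
            for r in range(8)]
cdr = cup_to_disk_ratio(cup_mask, disk_mask)  # 3 rows / 6 rows
```

Because the ratio depends only on the extents of the two masks, errors at the region boundaries translate directly into CDR errors, which is one reason boundary pixels may deserve extra weight during optimization.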

