
- SPEAKER CHARACTERIZATION USING TDNN-LSTM BASED SPEAKER EMBEDDING
In this paper we propose speaker characterization using speaker embeddings from a combined time-delay neural network and long short-term memory (TDNN-LSTM) architecture. Three types of front-end feature extraction are investigated to find good features for speaker embedding. Three kinds of data augmentation are used to increase the amount and diversity of the training data. The proposed methods are evaluated on the National Institute of Standards and Technology (NIST) speaker recognition evaluation (SRE) tasks.

360° cameras have recently become popular because they can capture the entire 360° scene, and a large number of related applications have been springing up. In this paper, we propose a deep learning based object detector that can be applied directly to 360° images. The proposed detector is based on modifications of the Faster R-CNN model. Three modification schemes are proposed: (1) distortion data augmentation, (2) multi-kernel layers that improve accuracy when detecting distorted objects, and (3) position information added to the model so that it can learn spatial information.
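One way to picture scheme (3) is to append normalized row and column coordinates to every pixel as extra input channels. The sketch below is only an illustration under that assumption; the abstract does not say how the position information is injected, and the function name `add_position_channels` is hypothetical.

```python
def add_position_channels(image):
    """Append normalized (row, column) coordinates to each pixel.

    `image` is a nested list of shape height x width x channels.
    Coordinates are scaled to [-1, 1]. This is an assumed encoding,
    not the scheme described in the paper.
    """
    height = len(image)
    width = len(image[0])

    def normalize(index, size):
        # Map index 0..size-1 onto [-1, 1]; a single row/column maps to 0.
        return 0.0 if size == 1 else 2.0 * index / (size - 1) - 1.0

    return [
        [
            list(pixel) + [normalize(r, height), normalize(c, width)]
            for c, pixel in enumerate(row)
        ]
        for r, row in enumerate(image)
    ]


# A 2 x 3 single-channel image gains two coordinate channels per pixel.
tiny = [[[0.5], [0.5], [0.5]], [[0.1], [0.1], [0.1]]]
with_position = add_position_channels(tiny)
```

In an equirectangular 360° image the row coordinate tracks latitude, which is where distortion varies most, so a row channel gives the network a direct cue about how stretched an object is likely to be.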
ICASSP_poster300.pdf

- NEURAL LATTICE DECODERS
Lattice decoders constructed with neural networks are presented. Firstly, we show how the fundamental parallelotope is used as a compact set for the approximation by a neural lattice decoder. Secondly, we introduce the notion of Voronoi reduced lattice basis. As a consequence, a first optimal neural lattice decoder is built from Boolean equations and the facets of the Voronoi cell. This decoder needs no learning. Finally, we present two neural decoders with learning. It is shown
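The role of the fundamental parallelotope as a compact set can be illustrated with a small sketch: any point in the plane splits into a lattice point plus a remainder that lies inside the parallelotope spanned by the basis. The code below assumes a 2-D lattice with basis vectors given as rows; the function name is hypothetical, and this folding step is not itself a decoder (it does not return the nearest lattice point).

```python
import math


def fold_into_parallelotope(basis, point):
    """Split `point` into a lattice point plus a remainder.

    `basis` holds two 2-D basis vectors (b1, b2) as rows. The remainder
    a1*b1 + a2*b2 has coefficients 0 <= a1, a2 < 1, so it lies in the
    fundamental parallelotope of the basis.
    """
    (b11, b12), (b21, b22) = basis
    det = b11 * b22 - b12 * b21
    if det == 0:
        raise ValueError("basis vectors must be linearly independent")

    y1, y2 = point
    # Solve a1*b1 + a2*b2 = point with Cramer's rule.
    a1 = (y1 * b22 - y2 * b21) / det
    a2 = (b11 * y2 - b12 * y1) / det

    # The integer parts select the lattice point; the fractions remain.
    k1, k2 = math.floor(a1), math.floor(a2)
    lattice_point = (k1 * b11 + k2 * b21, k1 * b12 + k2 * b22)
    remainder = (y1 - lattice_point[0], y2 - lattice_point[1])
    return (k1, k2), lattice_point, remainder


coefficients, lattice_point, remainder = fold_into_parallelotope(
    ((1.0, 0.0), (0.5, 1.0)), (2.75, 1.5)
)
```

Because every input reduces to a remainder in this bounded region, a function approximator only has to be accurate on the parallelotope rather than on the whole space.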

- FACE AGING AS IMAGE-TO-IMAGE TRANSLATION USING SHARED-LATENT SPACE GENERATIVE ADVERSARIAL NETWORKS
Here, a novel approach is proposed to generate age progression (i.e., future looks) and regression (i.e., previous looks) of persons based on their face images. The proposed method addresses face aging as an unsupervised image-to-image translation problem where the goal is to translate a face image belonging to an age class to an image of

- A Generalizable Model for Seizure Prediction based on Deep Learning using CNN-LSTM Architecture

- Three-dimensional Convolution Neural Network based Encrypted Traffic Classifier for Wireless Communications
Network traffic classification, which associates traffic flows with specific categories or intruders, plays an important role in network management and security. For network traffic classification in wireless communications, the major challenge is encrypted data. Researchers are usually not authorized to access the contents of the traffic flows and must analyze traffic features instead. Machine learning algorithms are widely used as classifiers, and representation learning makes feature extraction more accurate by avoiding manual operation.
Ran-ppt.pdf

- On Regression Losses for Depth Estimation
Depth estimation from a single monocular image has reached strong performance thanks to recent work based on deep networks. However, because the literature proposes many different losses, architectures, and experimental conditions, it is difficult to establish their respective influence on performance. In this paper we propose an in-depth study of various losses and experimental conditions for depth regression on the NYU dataset. From this study we propose a new network for depth estimation that combines an encoder-decoder architecture with an adversarial loss.
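To make the notion of competing regression losses concrete, the sketch below implements three losses that appear frequently in the depth-estimation literature: L1, L2, and the reverse Huber (berHu) loss. The abstract does not list which losses the study covers, so this selection, the function names, and the 0.2 threshold ratio are assumptions for illustration only.

```python
def l1_loss(predicted, target):
    """Mean absolute error over paired depth values."""
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)


def l2_loss(predicted, target):
    """Mean squared error over paired depth values."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def berhu_loss(predicted, target, threshold_ratio=0.2):
    """Reverse Huber loss: linear for small residuals, quadratic for large.

    The threshold c is a fraction of the largest absolute residual in the
    batch. The 0.2 default is a common convention, assumed here.
    """
    residuals = [abs(p - t) for p, t in zip(predicted, target)]
    c = threshold_ratio * max(residuals)
    if c == 0:
        return 0.0  # Every prediction is exact.
    total = sum(r if r <= c else (r * r + c * c) / (2 * c) for r in residuals)
    return total / len(residuals)


predicted = [1.0, 2.0, 4.0]
target = [1.0, 1.0, 2.0]
losses = {
    "l1": l1_loss(predicted, target),
    "l2": l2_loss(predicted, target),
    "berhu": berhu_loss(predicted, target),
}
```

The three functions weight large residuals differently, which is one reason results obtained with different losses are hard to compare directly.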

- Automatic Optic Disk and Cup Segmentation of Fundus Images Using Deep Learning
• Automatically segment the optic disk (OD) and cup regions in fundus images to derive clinical parameters, such as the cup-to-disk diameter ratio (CDR), to assist glaucoma diagnosis. An eye fundus camera is non-invasive and low-cost, enabling a large number of patients to be screened quickly.
• Discuss various strategies for leveraging multiple doctor annotations and prioritizing pixels belonging to different regions during network optimization.
• Evaluate the proposed approaches on the Drishti-GS dataset.
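Once the disk and cup have been segmented, a CDR can be read directly off the two binary masks. The sketch below assumes the vertical diameter is used and that each mask is a nested list of 0/1 values. Both are illustrative assumptions, since the summary above does not specify how the diameters are measured, and the function names are hypothetical.

```python
def vertical_diameter(mask):
    """Height in pixels of the foreground region of a binary mask."""
    rows = [index for index, row in enumerate(mask) if any(row)]
    if not rows:
        return 0
    return rows[-1] - rows[0] + 1


def cup_to_disk_ratio(cup_mask, disk_mask):
    """Vertical cup-to-disk diameter ratio from two binary masks."""
    disk_height = vertical_diameter(disk_mask)
    if disk_height == 0:
        raise ValueError("disk mask contains no foreground pixels")
    return vertical_diameter(cup_mask) / disk_height


# Toy 6 x 4 masks: the disk spans rows 1-4 and the cup spans rows 2-3.
disk = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
cup = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
ratio = cup_to_disk_ratio(cup, disk)
```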