
Unsupervised Learning (UL) models are a class of Machine Learning (ML) models concerned with dimensionality reduction, data factorization, disentanglement, and learning representations of the data. UL models owe their popularity to their ability to learn without predefined labels and to reduce noise and redundancy among data samples.


Spatial-temporal graph convolutional networks (ST-GCN) have achieved outstanding performance on human action recognition; however, they may perform less well on the two-person interaction recognition (TPIR) task because the relationship between the two skeletons is not considered. In this study, we present an improved ST-GCN model for TPIR, ST-GCN-PAM, which employs a pairwise adjacency matrix to capture the relationship between the skeletons of the two people. To validate the effectiveness of the proposed ST-GCN-PAM model on TPIR, experiments were conducted on NTU RGB+D 120.
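As a rough illustration of the pairwise-adjacency idea, the sketch below builds a single adjacency matrix over the joints of two skeletons. The abstract does not say how the person-to-person links are chosen, so connecting joints with the same index across the two people is an assumption made only for this example, as are the toy 3-joint skeleton and the function name `pairwise_adjacency`.

```python
def pairwise_adjacency(num_joints, bones, cross_links):
    """Build a (2V x 2V) adjacency matrix for two skeletons with V joints each.

    bones: (i, j) joint pairs connected within one skeleton.
    cross_links: (i, j) pairs linking joint i of person A to joint j of person B.
    """
    size = 2 * num_joints
    adj = [[0] * size for _ in range(size)]
    # Intra-person edges: the same bone list fills person A's block, then person B's.
    for offset in (0, num_joints):
        for i, j in bones:
            adj[offset + i][offset + j] = 1
            adj[offset + j][offset + i] = 1
    # Inter-person edges: these off-diagonal blocks carry the pairwise relationship.
    for i, j in cross_links:
        adj[i][num_joints + j] = 1
        adj[num_joints + j][i] = 1
    return adj


# Toy skeleton: 3 joints in a chain (0-1-2); hypothetical, not the NTU RGB+D layout.
bones = [(0, 1), (1, 2)]
# Assumed pairing: each joint is linked to the same joint of the other person.
cross_links = [(v, v) for v in range(3)]
adj = pairwise_adjacency(3, bones, cross_links)
```

Without the cross links the matrix is block-diagonal and the two skeletons are processed as unrelated graphs; the off-diagonal blocks are what let a graph convolution mix features between the two people.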


This paper presents an approach to human activity recognition in videos that employs a deep recurrent network taking appearance and optical flow information as inputs. We propose a novel architecture named BubbleNET, which is based on a recurrent layer dispersed into several modules (referred to as bubbles), along with an attention mechanism based on the squeeze-and-excitation strategy that modulates the contribution of each bubble.
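The squeeze-and-excitation gating mentioned here can be sketched generically as follows. This is not the BubbleNET implementation: the abstract gives no layer sizes or weights, so the function `squeeze_excite`, the bias-free two-layer gate, and all numbers below are hypothetical and only show how a per-module gate could rescale each bubble's output.

```python
import math


def squeeze_excite(bubble_outputs, w1, w2):
    """Rescale each bubble's output by a gate in (0, 1).

    bubble_outputs: list of B feature vectors, one per bubble.
    w1: (H x B) weights of the bottleneck layer (biases omitted for brevity).
    w2: (B x H) weights of the gating layer (biases omitted for brevity).
    """
    # Squeeze: summarize each bubble by the mean of its features.
    squeezed = [sum(vec) / len(vec) for vec in bubble_outputs]
    # Excitation: bottleneck layer with ReLU, then one sigmoid gate per bubble.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: modulate each bubble's contribution by its gate.
    scaled = [[g * x for x in vec] for g, vec in zip(gates, bubble_outputs)]
    return scaled, gates


# Three hypothetical bubbles with two features each, and a bottleneck of size 1.
bubbles = [[1.0, 3.0], [0.0, 0.0], [2.0, 2.0]]
w1 = [[1.0, 0.0, 0.0]]
w2 = [[1.0], [0.0], [-1.0]]
scaled, gates = squeeze_excite(bubbles, w1, w2)
```

With these made-up weights the first bubble receives a gate above 0.5 and the third a gate below 0.5, so the first contributes more to whatever layer consumes `scaled`.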


State-of-the-art hearing aids (HA) are limited in their ability to recognize acoustic environments. Much research effort goes into improving the listening experience of HA users in every acoustic situation. There is, however, no dedicated public database for training acoustic environment recognition algorithms that focuses specifically on HA applications and accounts for their requirements. Existing acoustic scene classification databases are inappropriate for HA signal processing.


As handwriting input becomes more prevalent, the large symbol inventory required to support Chinese handwriting recognition poses unique challenges. This paper describes how the Apple deep learning recognition system can accurately handle up to 30,000 Chinese characters while running in real time across a range of mobile devices.


We propose a general projection-free metric learning framework, where the minimization objective $\min_{\mathbf{M} \in \mathcal{S}} Q(\mathbf{M})$ is a convex differentiable function of the metric matrix $\mathbf{M}$, and $\mathbf{M}$ resides in the set $\mathcal{S}$ of generalized graph Laplacian matrices for connected graphs with positive edge weights and node degrees.
Unlike low-rank metric matrices common in the literature, $\mathcal{S}$ includes the important positive-diagonal-only matrices as a special case in the limit.
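To make the constraint set concrete, the sketch below tests whether a small matrix could belong to the set of generalized graph Laplacian matrices described above. The abstract does not spell out the definition, so this is one plausible reading and an assumption throughout: the matrix is symmetric, each off-diagonal entry is the negative of a non-negative edge weight, each diagonal entry (taken here as the node degree) is positive, and the graph formed by the nonzero off-diagonal entries is connected. The function name `in_laplacian_set` is hypothetical.

```python
def in_laplacian_set(m, tol=1e-12):
    """Check the assumed membership conditions for a square matrix m (list of lists)."""
    n = len(m)
    for i in range(n):
        if m[i][i] <= 0:                       # assumed: positive node degree
            return False
        for j in range(i + 1, n):
            if abs(m[i][j] - m[j][i]) > tol:   # must be symmetric
                return False
            if m[i][j] > 0:                    # off-diagonal = -(edge weight) <= 0
                return False
    # Connectivity: depth-first search over edges with nonzero weight.
    seen = {0}
    stack = [0]
    while stack:
        i = stack.pop()
        for j in range(n):
            if j != i and j not in seen and m[i][j] < -tol:
                seen.add(j)
                stack.append(j)
    return len(seen) == n


connected = [[2.0, -1.0], [-1.0, 2.0]]      # one positive edge of weight 1
diagonal_only = [[1.0, 0.0], [0.0, 1.0]]    # no edges, so the graph is disconnected
negative_edge = [[2.0, 1.0], [1.0, 2.0]]    # would imply a negative edge weight

results = [in_laplacian_set(x) for x in (connected, diagonal_only, negative_edge)]
```

Under this reading a purely diagonal matrix fails the connectivity test, which is consistent with the abstract describing positive-diagonal-only matrices as a special case reached only in the limit, as the edge weights shrink toward zero.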


Epilepsy affects approximately 1% of the world’s population. The semiology of epileptic seizures contains major clinical signs used to classify epilepsy syndromes, which epileptologists currently evaluate by simple visual inspection of video. Automatic and semiautomatic methods for seizure detection and classification are needed to better support patient monitoring management and diagnostic decisions. Marker-less computer-vision techniques are among the most promising current approaches.


Scene text detection has received attention for years and has achieved impressive performance across various benchmarks. In this work, we propose an efficient and accurate approach to detecting multi-oriented text in scene images. The proposed feature fusion mechanism allows us to use a shallower network, which reduces the computational complexity. A self-attention mechanism is adopted to suppress false positive detections.

