- End-to-end Keyword Spotting using Neural Architecture Search and Quantization
This paper introduces neural architecture search (NAS) for the automatic discovery of end-to-end keyword spotting (KWS) models in resource-limited environments. We employ a differentiable NAS approach to optimize the structure of convolutional neural networks (CNNs) operating on raw audio waveforms. After a suitable KWS model is found with NAS, we quantize its weights and activations to reduce the memory footprint. We conduct extensive experiments on the Google speech commands dataset.
icassp_2022_poster.pdf
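The quantization step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes uniform symmetric quantization with a single scale per tensor, which is only one common choice; the abstract does not say which scheme the authors use. The names `quantize_symmetric` and `dequantize` and the example weights are hypothetical.

```python
def quantize_symmetric(values, num_bits=8):
    """Map floats to signed integer codes using one shared scale (assumed scheme)."""
    qmax = 2 ** (num_bits - 1) - 1            # largest code, e.g. 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights, not taken from the paper.
weights = [0.42, -1.27, 0.03, 0.88, -0.51]
codes, scale = quantize_symmetric(weights, num_bits=8)
recovered = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, recovered))
```

Storing 8-bit codes plus one scale instead of 32-bit floats cuts the weight memory roughly fourfold, at the cost of a rounding error of at most half a quantization step per value.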
- AN INVESTIGATION OF THE EFFECTIVENESS OF PHASE FOR AUDIO CLASSIFICATION
While the log-amplitude mel-spectrogram has been widely used as the feature representation for deep-learning-based speech processing, the effectiveness of another aspect of the speech spectrum, i.e., phase information, was recently shown for tasks such as speech enhancement and source separation. In this study, we extensively investigated the effectiveness of including the phase information of signals for eight audio classification tasks. We constructed a learnable front-end that can compute the phase and its derivatives based on a time-frequency representation with a mel-like frequency axis.
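To make "the phase and its derivatives" concrete, here is a minimal sketch that computes the phase of a short-time Fourier transform and its wrapped time derivative (an instantaneous-frequency estimate). It is not the learnable, mel-like front-end of the paper: it assumes a fixed Hann-windowed STFT on a linear frequency axis, and the function names and signal parameters are hypothetical.

```python
import cmath
import math


def stft(signal, frame_len, hop):
    """Naive STFT with a periodic Hann window; returns frames[t][k] as complex."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = [
            sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len))
            for k in range(frame_len // 2 + 1)
        ]
        frames.append(spectrum)
    return frames


def wrap(angle):
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def phase_time_derivative(frames, hop):
    """Wrapped phase difference between consecutive frames, in radians per sample."""
    return [
        [wrap(cmath.phase(cur[k]) - cmath.phase(prev[k])) / hop
         for k in range(len(cur))]
        for prev, cur in zip(frames, frames[1:])
    ]


# Hypothetical test tone: 500 Hz at an 8 kHz sampling rate.
sample_rate, tone_hz, frame_len, hop = 8000, 500.0, 64, 4
signal = [math.cos(2 * math.pi * tone_hz * n / sample_rate) for n in range(256)]
frames = stft(signal, frame_len, hop)
dphi = phase_time_derivative(frames, hop)
tone_bin = round(tone_hz * frame_len / sample_rate)   # bin that holds the tone
estimated_hz = dphi[0][tone_bin] * sample_rate / (2 * math.pi)
```

In the bin that holds the tone, the phase advances at a rate proportional to the tone frequency, so the time derivative carries information that the magnitude spectrum alone does not. The estimate is unambiguous only while the phase advance per hop stays below pi, which is why the hop is kept small here.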
- Multitask Gaussian Process with Hierarchical Latent Interactions
- DOMAIN-INVARIANT REPRESENTATION LEARNING FROM EEG WITH PRIVATE ENCODERS
Deep learning based electroencephalography (EEG) signal processing methods are known to suffer from poor test-time generalization due to changes in data distribution. This becomes a more challenging problem when privacy-preserving representation learning is of interest, such as in clinical settings. To that end, we propose a multi-source learning architecture in which we extract domain-invariant representations from dataset-specific private encoders.
- HOLISTIC SEMI-SUPERVISED APPROACHES FOR EEG REPRESENTATION LEARNING
- Feature Fusion Ensemble Architecture With Active Learning For Microscopic Blood Smear Analysis
Blood smear analysis provides vital information and forms the basis for diagnosing most diseases. With recent developments, deep learning methods can analyze microscopic blood samples through image processing and classification with less human effort and increased accuracy.
In machine learning, classifiers are typically susceptible to noise in the training data. In this work, we aim to reduce intra-class noise with the help of graph filtering in order to improve classification performance. The graphs considered are obtained by connecting samples of the training set that belong to the same class, depending on the similarity of their representations in a latent space. We show that the proposed graph filtering methodology has the effect of asymptotically reducing intra-class variance while maintaining the mean.
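The claim that graph filtering lowers intra-class variance while keeping the mean can be checked on a toy example. The sketch below is an assumption, not the authors' filter: it links same-class samples whose Euclidean distance falls under a hypothetical threshold and applies one low-pass step x_i <- x_i + alpha * sum_j a_ij (x_j - x_i). Because the adjacency is symmetric, the pairwise updates cancel in the sum, so the class mean is unchanged.

```python
import math


def build_adjacency(features, threshold):
    """Symmetric 0/1 adjacency linking samples closer than `threshold` (assumed rule)."""
    n = len(features)
    return [
        [1 if i != j and math.dist(features[i], features[j]) < threshold else 0
         for j in range(n)]
        for i in range(n)
    ]


def graph_filter(features, adjacency, alpha):
    """One low-pass step: move each sample toward its graph neighbours."""
    filtered = []
    for i, x_i in enumerate(features):
        row = list(x_i)
        for j, x_j in enumerate(features):
            if adjacency[i][j]:
                for d in range(len(row)):
                    row[d] += alpha * (x_j[d] - x_i[d])
        filtered.append(row)
    return filtered


def mean(features):
    n = len(features)
    return [sum(x[d] for x in features) / n for d in range(len(features[0]))]


def total_variance(features):
    """Mean squared distance of the samples to their mean."""
    mu = mean(features)
    return sum(math.dist(x, mu) ** 2 for x in features) / len(features)


# Hypothetical latent representations of five samples from one class.
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 2.0]]
adjacency = build_adjacency(samples, threshold=1.5)
max_degree = max(sum(row) for row in adjacency)
alpha = 0.5 / max_degree          # small step keeps the filter a contraction
smoothed = graph_filter(samples, adjacency, alpha)

mean_before, mean_after = mean(samples), mean(smoothed)
var_before, var_after = total_variance(samples), total_variance(smoothed)
```

Applying the step repeatedly shrinks the variance further on a connected graph, which mirrors the asymptotic behaviour described in the abstract, though the exact filter and graph construction used by the authors may differ.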
Generative Adversarial Networks (GANs) have been used recently for anomaly detection from images, where the anomaly scores are obtained by comparing the global difference between the input and generated image. However, the anomalies often appear in local areas of an image scene, and ignoring such information can lead to unreliable detection of anomalies.
- Multiscale IoU: A Metric for Evaluation of Salient Object Detection with Fine Structures
General-purpose object-detection algorithms often dismiss the fine structure of detected objects. This can be traced back to how their proposed regions are evaluated. Our goal is to renegotiate the trade-off between the generality of these algorithms and their coarse detections. In this work, we present a new metric that is a marriage of a popular evaluation metric, namely Intersection over Union (IoU), and a geometrical concept, called fractal dimension. We propose Multiscale IoU (MIoU) which allows comparison between the detected and ground-truth regions at multiple resolution levels.
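The idea of comparing regions at multiple resolution levels can be sketched as follows. This is an illustration under stated assumptions, not the MIoU definition from the paper: it borrows the box-counting view used for fractal dimension, marks a box as occupied if it contains any foreground pixel, and reports one IoU per box size. How the paper combines the levels into a single score is not given in the abstract, so the sketch stops at the per-scale profile. All function names and the toy masks are hypothetical.

```python
def occupied_boxes(mask, box):
    """Set of (row, col) box indices that contain at least one foreground pixel."""
    return {
        (r // box, c // box)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value
    }


def iou_at_scale(pred, truth, box):
    """IoU between the two masks after coarsening both to `box` x `box` cells."""
    p, t = occupied_boxes(pred, box), occupied_boxes(truth, box)
    union = p | t
    return len(p & t) / len(union) if union else 1.0


def multiscale_iou_profile(pred, truth, boxes=(1, 2, 4, 8)):
    """IoU at each box size, from finest (1 pixel) to coarsest."""
    return {box: iou_at_scale(pred, truth, box) for box in boxes}


# Toy 8x8 masks: the ground truth is a thin diagonal line, the prediction
# is the same line thickened by one pixel to the right.
size = 8
truth = [[1 if c == r else 0 for c in range(size)] for r in range(size)]
pred = [[1 if c in (r, r + 1) else 0 for c in range(size)] for r in range(size)]
scores = multiscale_iou_profile(pred, truth)
```

On these masks the score rises as the boxes get coarser and reaches 1.0 at the coarsest level, showing how a single coarse comparison can hide a mismatch in fine structure that the finer levels expose.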
- MICRO-EXPRESSION RECOGNITION BASED ON VIDEO MOTION MAGNIFICATION AND PRE-TRAINED NEURAL NETWORK
This paper investigates the effects of using amplitude-based and phase-based video motion magnification methods to amplify small facial movements. We hypothesise that this approach will assist in the micro-expression recognition task. To this end, we apply the pre-trained VGGFace2 model, with its excellent facial feature capturing ability, to transfer-learn the magnified micro-expression movement, then encode the spatial information and decode the spatial and temporal information with a Bi-LSTM model.