Although transformer-based models have improved the state of the art in speech recognition, it is still not well understood what information from the speech signal these models encode in their latent representations. This study investigates the potential of using labelled data (TIMIT) to probe wav2vec 2.0 embeddings for insights into the encoding and visualisation of speech signal information at phone boundaries. Our experiment involves training probing models to detect phone-specific articulatory features in the hidden layers based on IPA classifications.

To alleviate the negative impact of noisy labels, most noisy label learning (NLL) methods dynamically divide the training data into two types, “clean samples” and “noisy samples”, during training. However, the conventional selection of clean samples depends heavily on the features learned in the early stages of training, making it difficult to guarantee the cleanliness of the selected samples when the noise ratio is high.

Although belief propagation (BP) decoders are efficient and perform well on classical low-density parity-check (LDPC) codes, their performance degrades on quantum LDPC (QLDPC) codes due to the limitations in the quantum field. In this paper, we propose a posterior adjustment of either a single qubit or multiple qubits within binary BP. The adjustment process changes the posterior likelihood ratio for one or multiple qubits according to the

Efficient utilization of large-scale biobank data is crucial for inferring the genetic basis of disease and predicting health outcomes from the DNA. Yet we lack efficient, accurate methods that scale to data where electronic health records are linked to whole genome sequence information. To address this issue, our paper develops a new algorithmic paradigm based on Approximate Message Passing (AMP), which is specifically tailored for genomic prediction and association testing.

Loihi 2 is a fully event-based neuromorphic processor that supports a wide range of synaptic connectivity configurations and temporal neuron dynamics. Loihi 2's temporal and event-based paradigm is naturally well-suited to processing data from an event-based sensor, such as a Dynamic Vision Sensor (DVS) or a Silicon Cochlea. However, this raises the question: How general are signal processing efficiency gains on Loihi 2 versus conventional computer architectures?

We consider the problem of classifying a pair of Synthetic Aperture Radar (SAR) images by proposing an explainable and frugal algorithm that integrates a set of divergences. The approach relies on a statistical framework that takes standard probability distributions into account for modelling SAR data. Then, by learning a combination of parameterized Renyi divergences and their parameters from the data, we are able to classify the pair of images with fewer parameters than regular machine learning approaches while also allowing an interpretation of the results related to the priors used.
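As a rough illustration of the kind of quantity involved, the sketch below computes the Rényi divergence of order alpha between two discrete distributions and a weighted sum of several such divergences. The discrete form, the function names `renyi_divergence` and `combined_score`, and the fixed weights are assumptions made for illustration only; the abstract does not specify the distributions, the combination rule, or how the parameters are learned.

```python
import math

def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(p || q) for discrete distributions.

    Assumes p and q have the same length, sum to 1, and are strictly positive.
    For alpha == 1 the Kullback-Leibler divergence (the limiting case) is returned.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if math.isclose(alpha, 1.0):
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    total = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return math.log(total) / (alpha - 1.0)

def combined_score(p, q, alphas, weights):
    """Hypothetical weighted sum of Renyi divergences of several orders."""
    return sum(w * renyi_divergence(p, q, a) for a, w in zip(alphas, weights))

if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [0.25, 0.75]
    print(renyi_divergence(p, q, 2.0))
    print(combined_score(p, q, alphas=[0.5, 2.0], weights=[0.3, 0.7]))
```

In a learned setting the orders and weights would be fitted to data rather than fixed by hand as they are here.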

Empathetic response generation requires perceiving and understanding the user’s emotion to deliver a suitable response. However, existing models generally remain oblivious to an interlocutor’s persona, which has been shown to play a vital role in expressing appropriate empathy to different users. To address this problem, we propose a novel Transformer-based architecture that incorporates retrieval-augmented prompt learning to generate persona-aware empathetic responses.

This study introduces an angle-based micro-Doppler analysis using Frequency Modulated Continuous Wave (FMCW) radar tailored for axle-based vehicle classification. The novel approach exploits the signal angle of arrival to separate incoming signals and noise from distinct targets. This is done by analysing the phase difference of a dual antenna radar system based on the time-frequency representation of the radar beat signal. Vehicles driving side by side can now be discriminated. Multipath signals and clutter are more easily identified and filtered out.
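The relation between inter-antenna phase difference and angle of arrival can be sketched with the standard two-element, far-field model, in which the phase difference equals 2π·d·sin(θ)/λ. This is a minimal sketch under that textbook assumption, not the processing chain of the study; the function name `angle_of_arrival` and the example spacing and wavelength are illustrative.

```python
import math

def angle_of_arrival(phase_diff_rad, antenna_spacing_m, wavelength_m):
    """Estimate the angle of arrival (radians) from a two-antenna phase difference.

    Assumes a far-field plane wave, so that
    phase_diff = 2 * pi * spacing * sin(theta) / wavelength.
    """
    s = phase_diff_rad * wavelength_m / (2.0 * math.pi * antenna_spacing_m)
    # Clamp against small numerical or noise-induced overshoot before asin.
    s = max(-1.0, min(1.0, s))
    return math.asin(s)

if __name__ == "__main__":
    # Half-wavelength spacing (illustrative values, not taken from the study).
    theta = angle_of_arrival(math.pi / 2, antenna_spacing_m=0.5, wavelength_m=1.0)
    print(math.degrees(theta))
```

With half-wavelength spacing the mapping is unambiguous over the full ±90° range; wider spacing would introduce angle ambiguities.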

Recent healthcare applications of natural language processing involve multi-label classification of health records using the International Classification of Diseases (ICD). While prior research highlights intricate text models and explores external knowledge like hierarchical ICD ontology, fewer studies integrate code relationships from whole datasets to enhance ICD coding accuracy. This study presents a modular approach, sequentially combining graph-based integration of ICD code co-occurrence with a hard-coded hierarchical enriched text representation drawn from the ICD ontology.

Smart home device control is difficult when the instruction is abstract and the planner must adjust to dynamic home configurations. As the capabilities of large language models (LLMs) have increased, they have become the customary choice for zero-shot planning tasks such as smart home device control. Although cloud-supported large language models can perform device control tasks seamlessly, on-device small language models show limited capabilities. In this work, we show how large language models can be leveraged to enable small language models for device control tasks.
