Sorry, you need to enable JavaScript to visit this website.

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers, cutting-edge session topics and provide a fantastic opportunity to network with like-minded professionals from around the world. Visit website.

Recent studies suggest that Resolution Enhancement Compression (REC) can provide significant improvements in terms of imaging quality over Classical Pulsed (CP) ultrasonic imaging techniques, by employing frequency and amplitude modulated transmitted signals. However the performance of coded excitations methods degrades drastically deeper into the tissue where the attenuation effects become more significant. In this work, a technique that allows overcoming the effects of attenuation on REC imaging is proposed (REC-Opt).

Categories:
7 Views

The acoustic-to-word model based on the connectionist temporal classification (CTC) criterion was shown as a natural end-to-end (E2E) model directly targeting words as output units. However, the word-based CTC model suffers from the out-of-vocabulary (OOV) issue as it can only model limited number of words in the output layer and maps all the remaining words into an OOV output node. Hence, such a word-based CTC model can only recognize the frequent words modeled by the network output nodes.

Categories:
8 Views

In this study, we develop the keyword spotting (KWS) and acoustic model (AM) components in a far-field speaker system. Specifically, we use teacher-student (T/S) learning to adapt a close-talk well-trained production AM to far-field by using parallel close-talk and simulated far-field data. We also use T/S learning to compress a large-size KWS model into a small-size one to fit the device computational cost. Without the need of transcription, T/S learning well utilizes untranscribed data to boost the model performance in both the AM adaptation and KWS model compression.

Categories:
6 Views

In this paper, we focus on the problem of content-based retrieval for
audio, which aims to retrieve all semantically similar audio recordings
for a given audio clip query. This problem is similar to the
problem of query by example of audio, which aims to retrieve media
samples from a database, which are similar to the user-provided example.
We propose a novel approach which encodes the audio into
a vector representation using Siamese Neural Networks. The goal is
to obtain an encoding similar for files belonging to the same audio

Categories:
4 Views

The current Document Image Quality Assessment (DIQA) algorithms directly relate the Optical Character Recognition (OCR) accuracies with the quality of the document to build supervised learning frameworks. This direct correlation has two major limitations: (a) OCR may be affected by factors independent of the quality of the capture and (b) it cannot account for blur variations within an image. An alternate possibility is to quantify the quality of capture using human judgement, however, it is subjective and prone to error.

Categories:
3 Views

Pages