- Transducers
- Spatial and Multichannel Audio
- Source Separation and Signal Enhancement
- Room Acoustics and Acoustic System Modeling
- Network Audio
- Audio for Multimedia
- Audio Processing Systems
- Audio Coding
- Audio Analysis and Synthesis
- Active Noise Control
- Auditory Modeling and Hearing Aids
- Bioacoustics and Medical Acoustics
- Music Signal Processing
- Loudspeaker and Microphone Array Signal Processing
- Echo Cancellation
- Content-Based Audio Processing
- Read more about MULTI-SCALE OBJECT DETECTION WITH FEATURE FUSION AND REGION OBJECTNESS NETWORK
- Read more about Whole Sentence Neural Language Model
Recurrent neural networks have become increasingly popular for language modeling, achieving impressive gains in state-of-the-art speech recognition and natural language processing (NLP) tasks. Recurrent models exploit word dependencies over a much longer context window (as retained by the history states) than is feasible with n-gram language models.
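The difference in context window can be illustrated with a toy sketch. It is not the model from the paper: the vocabulary, the word histories, and the random, untrained weights are all assumptions made for illustration. A trigram model conditions only on the last two words, so two histories that differ earlier look identical to it, while a plain tanh recurrence carries the earlier difference forward in its state.

```python
import numpy as np

def ngram_context(history, n):
    """Return what an n-gram model conditions on: the last n-1 words."""
    return tuple(history[-(n - 1):]) if n > 1 else ()

def rnn_state(token_ids, w_in, w_rec):
    """Run a plain tanh recurrence over the whole history; return the final state."""
    h = np.zeros(w_rec.shape[0])
    for t in token_ids:
        h = np.tanh(w_in[:, t] + w_rec @ h)
    return h

# Hypothetical toy vocabulary and histories (not from the paper).
vocab = {"the": 0, "a": 1, "cat": 2, "dog": 3, "sat": 4, "on": 5}
hist_a = ["the", "cat", "sat", "on", "the"]
hist_b = ["the", "dog", "sat", "on", "the"]  # differs only in the second word

# Random, untrained weights: enough to show that the state depends on
# the full history, not to model language.
rng = np.random.default_rng(0)
w_in = rng.normal(scale=0.5, size=(8, len(vocab)))
w_rec = rng.normal(scale=0.5, size=(8, 8))

ctx_a = ngram_context(hist_a, 3)
ctx_b = ngram_context(hist_b, 3)
state_a = rnn_state([vocab[w] for w in hist_a], w_in, w_rec)
state_b = rnn_state([vocab[w] for w in hist_b], w_in, w_rec)

print("trigram contexts equal:", ctx_a == ctx_b)
print("RNN states equal:", np.allclose(state_a, state_b))
```

The trigram contexts coincide because both histories end in the same two words, whereas the recurrent states differ because the second word still influences them.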
- Read more about Signboard Saliency Detection in Street Videos
- Read more about Acoustic Reflector Localization and Classification
Understanding the acoustic properties of environments is important for several applications, such as spatial audio, augmented reality, and source separation. In this paper, multichannel room impulse responses are recorded and transformed into the direction of arrival (DOA)-time domain using a superdirective beamformer. This domain can be represented as a 2D image. Hence, a novel image processing method is proposed to analyze the DOA-time domain and estimate the reflection times of arrival and DOAs. The main acoustically reflective objects are then localized.
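A DOA-time representation of this kind can be sketched as follows. Note the substitution: the abstract uses a superdirective beamformer, whereas this sketch uses a simpler frequency-domain delay-and-sum beamformer. The uniform linear array, the 5 cm spacing, the sampling rate, and the function name `doa_time_map` are all assumptions made for illustration.

```python
import numpy as np

def doa_time_map(rirs, mic_x, angles_deg, fs, c=343.0):
    """Steer a delay-and-sum beamformer over candidate angles.

    rirs:  (n_mics, n_samples) impulse responses
    mic_x: mic positions along the array axis, in meters
    Returns an (n_angles, n_samples) map, i.e. a 2D "image".
    """
    n_mics, n_samples = rirs.shape
    spectrum = np.fft.rfft(rirs, axis=1)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    out = np.empty((len(angles_deg), n_samples))
    for i, ang in enumerate(np.deg2rad(angles_deg)):
        # A plane wave from angle `ang` (measured from broadside) reaches
        # a mic at position x earlier by x*sin(ang)/c seconds.
        advance = mic_x * np.sin(ang) / c
        # Delay each channel by its advance so all channels line up.
        steer = np.exp(-2j * np.pi * freqs[None, :] * advance[:, None])
        out[i] = np.fft.irfft((spectrum * steer).mean(axis=0), n=n_samples)
    return out

# Synthetic check: a single ideal reflection arriving from 30 degrees
# at sample 100 (hypothetical values, not measured data).
fs = 16000
n_samples = 512
mic_x = np.arange(8) * 0.05            # assumed 8-mic line array, 5 cm spacing
angles = np.arange(-90, 91, 5)
true_angle, true_sample = 30.0, 100

freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
arrival = true_sample / fs - mic_x * np.sin(np.deg2rad(true_angle)) / 343.0
rirs = np.fft.irfft(
    np.exp(-2j * np.pi * freqs[None, :] * arrival[:, None]),
    n=n_samples, axis=1,
)

dmap = doa_time_map(rirs, mic_x, angles, fs)
peak_angle_idx, peak_sample = np.unravel_index(np.argmax(np.abs(dmap)), dmap.shape)
print("peak angle (deg):", angles[peak_angle_idx], "peak sample:", peak_sample)
```

Each row of `dmap` is the beamformer output over time for one steering angle, so peaks in the map correspond to candidate reflections with a time of arrival and a DOA. An image processing step like the one the abstract describes would then operate on this array.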
- Read more about CLASSIFICATION OF CORALS IN REFLECTANCE AND FLUORESCENCE IMAGES USING CONVOLUTIONAL NEURAL NETWORK REPRESENTATIONS
Coral species, with complex morphology and ambiguous boundaries, pose a great challenge for automated classification. CNN activations, which are extracted from fully connected layers of deep networks (FC features), have been successfully used as powerful universal representations in many visual tasks. In this paper, we investigate the transferability and combined performance of FC features and CONV features (extracted
- Read more about Determined Blind Source Separation via Proximal Splitting Algorithm
The state-of-the-art algorithms of determined blind source separation (BSS) methods based on the independent component analysis
- Read more about Phase Corrected Total Variation for Audio Signals
In optimization-based signal processing, the so-called prior term models the desired signal, so its design is key to achieving good performance. For audio signals, the time-directional total variation applied to a spectrogram in combination with phase correction has been proposed recently to model sinusoidal components of the signal. Although it is a promising prior, its applicability might be restricted to some extent because of a mismatch between its assumption and the signal.
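The idea of a time-directional total variation on a phase-corrected spectrogram can be sketched as below. This is a simplified variant that corrects the phase using each bin's center frequency, which may differ from the correction used in the paper. The window, hop size, and function names are assumptions. For a stationary sinusoid, the raw STFT phase rotates from frame to frame, so differences along time are large; after correction, the bin at the sinusoid's frequency becomes constant over time and the total variation drops.

```python
import numpy as np

def stft(x, win_len=64, hop=16):
    """Minimal STFT: Hann-windowed frames, full FFT. Shape (win_len, n_frames)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + win_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.fft.fft(frames, axis=0)

def phase_correct(spec, hop):
    """Undo the frame-to-frame phase rotation expected at each bin's center frequency."""
    n_bins, n_frames = spec.shape
    m = np.arange(n_bins)[:, None]      # frequency bin index
    n = np.arange(n_frames)[None, :]    # frame index
    return spec * np.exp(-2j * np.pi * hop * m * n / n_bins)

def time_tv(spec):
    """Time-directional total variation: sum of |differences| along the frame axis."""
    return np.sum(np.abs(np.diff(spec, axis=1)))

# A complex sinusoid sitting exactly on bin k (hypothetical test signal).
win_len, hop, k = 64, 16, 5
t = np.arange(256)
x = np.exp(2j * np.pi * k * t / win_len)

spec = stft(x, win_len, hop)
corrected = phase_correct(spec, hop)
print("TV without phase correction:", time_tv(spec))
print("TV with phase correction:   ", time_tv(corrected))
```

The corrected value is not zero in this sketch, because the window spreads energy into neighboring bins whose center frequencies do not match the sinusoid. That is one concrete form of the assumption mismatch the abstract mentions.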
- Read more about A NOVEL LSTM-BASED SPEECH PREPROCESSOR FOR SPEAKER DIARIZATION IN REALISTIC MISMATCH CONDITIONS
- Read more about On Sequential Random Distortion Testing of Non-Stationary Processes
Random distortion testing (RDT) addresses the problem of testing whether or not a random signal deviates by more than a specified tolerance from a fixed value. The test is non-parametric in the sense that the distribution of the signal under each hypothesis is assumed to be unknown. The signal is observed in independent and identically distributed (i.i.d.) additive noise. The need to control the probabilities of false alarm and missed detection while reducing the number of samples required to make a decision leads to the SeqRDT approach.
- Read more about AUTOMATIC TEMPORAL SEGMENTATION OF HAND MOVEMENTS FOR HAND POSITIONS RECOGNITION IN FRENCH CUED SPEECH
In the context of Cued Speech (CS) recognition, the recognition of lips and hand movements is a key task. As we know, a good temporal segmentation is necessary for the supervised recognition system. However, lips and hand streams cannot share the same temporal segmentation since they are not synchronized. In this work, we propose a hand preceding model to predict temporal segmentations of hand movements automatically by exploring the relationship between hand preceding time