ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.
- Read more about ONLINE SINGING VOICE SEPARATION USING A RECURRENT ONE-DIMENSIONAL U-NET TRAINED WITH DEEP FEATURE LOSSES
This paper proposes an online approach to the singing voice separation problem. Based on a combination of one-dimensional convolutional layers along the frequency axis and recurrent layers to enforce temporal coherency, state-of-the-art performance is achieved. The concept of using deep features in the loss function to guide training and improve the model’s performance is also investigated.
- Read more about Efficient Nonlinear Acoustic Echo Cancellation by Dual-stage Multi-channel Kalman Filtering
Mobile devices for hands-free speech communication often show significant nonlinear distortion in the sound emitted by their loudspeakers. Therefore, conventional linear echo cancellation is not sufficient for maintaining a high conversation quality. In this work we propose a nonlinear echo canceller that uses two serially cascaded adaptive filters to compensate for the nonlinear and linear echo. We show that a stable operation of the cascaded structure is achieved by using the multi-channel Kalman algorithm in the frequency domain with filtered-x references.
- Read more about Deep Latent Factor Model for Predicting Drug Target Interactions
In drug-target interaction (DTI) prediction, the interactions of a subset of drugs with a subset of targets are known. The goal is to predict the interactions of all drugs with all targets. One approach is to formulate this as a matrix completion problem, where the interaction matrix, with drugs along the rows and targets along the columns, is only partially filled. So far, standard matrix completion approaches such as nuclear norm minimization and matrix factorization have been used to address the problem.
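As a rough illustration of the matrix completion formulation described above, the sketch below fits a low-rank factorization to the known entries of a small drug-by-target matrix and then scores an unobserved pair. It is plain matrix factorization trained by stochastic gradient descent, not the deep latent factor model named in the title. The toy interaction values, the function names `factorize` and `predict`, and all hyperparameters are assumptions made for this example.

```python
import random


def factorize(observed, n_rows, n_cols, rank=2, lr=0.05, reg=0.01,
              epochs=2000, seed=0):
    """Fit R ~ U V^T using only the observed entries (SGD with L2 penalty).

    observed maps (drug_index, target_index) -> known interaction value.
    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    U = [[rng.uniform(-0.5, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.uniform(-0.5, 0.5) for _ in range(rank)] for _ in range(n_cols)]
    entries = list(observed.items())
    for _ in range(epochs):
        rng.shuffle(entries)
        for (i, j), y in entries:
            pred = sum(U[i][k] * V[j][k] for k in range(rank))
            err = y - pred
            for k in range(rank):
                u, v = U[i][k], V[j][k]
                # Gradient step on the squared error plus the L2 penalty.
                U[i][k] += lr * (err * v - reg * u)
                V[j][k] += lr * (err * u - reg * v)
    return U, V


def predict(U, V, i, j):
    """Predicted interaction score for drug i and target j."""
    return sum(u * v for u, v in zip(U[i], V[j]))


# Hypothetical toy data: 4 drugs x 3 targets, 8 of 12 entries known.
observed = {
    (0, 0): 1.0, (0, 1): 0.0,
    (1, 0): 1.0, (1, 2): 1.0,
    (2, 1): 1.0, (2, 2): 0.0,
    (3, 0): 0.0, (3, 1): 1.0,
}
U, V = factorize(observed, n_rows=4, n_cols=3)
score = predict(U, V, 0, 2)  # drug 0 on target 2 was never observed
```

The loss runs only over observed pairs, which is what distinguishes matrix completion from ordinary factorization of a fully known matrix.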
- Read more about A Subband Energy Modification Method for Elevation Control in Median Plane
Elevation perception is crucial for binaural reproduction. A recent study proposed an elevation control method by modifying the energy of HRTFs in each auditory scale subband, such as the ERB and Mel subband. However, this subband division is designed based on auditory excitation patterns and may not be consistent with the elevation localization cues. To this end, this study proposes a novel subband division strategy which emphasizes the physiological information involved in elevation localization based on a statistical analysis of the HRTF.
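To make the idea of subband energy modification concrete, the following sketch scales the magnitude response of an HRTF within each subband so that the band energy changes by a chosen number of decibels. It is only a minimal illustration. The band edges, gain values, and the function names `band_energy` and `modify_subband_energy` are hypothetical, and neither the ERB/Mel division nor the subband division proposed in the paper is reproduced here.

```python
def band_energy(magnitudes, start, stop):
    """Energy of the bins in [start, stop): the sum of squared magnitudes."""
    return sum(m * m for m in magnitudes[start:stop])


def modify_subband_energy(magnitudes, bands, gains_db):
    """Return a copy of the magnitude response with per-band energy gains.

    bands is a list of (start, stop) bin ranges, stop exclusive.
    gains_db gives the desired energy change of each band in dB.
    """
    out = list(magnitudes)
    for (start, stop), gain_db in zip(bands, gains_db):
        # An energy gain of g dB corresponds to a magnitude factor of 10^(g/20).
        scale = 10.0 ** (gain_db / 20.0)
        for k in range(start, stop):
            out[k] *= scale
    return out


# Hypothetical 6-bin magnitude response split into two subbands.
hrtf_mag = [1.0, 2.0, 0.5, 1.0, 3.0, 1.0]
bands = [(0, 3), (3, 6)]
modified = modify_subband_energy(hrtf_mag, bands, gains_db=[10.0, 0.0])
```

Here the first band gains 10 dB of energy while the second is left unchanged.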
- Read more about Guided-spatio-temporal filtering for extracting sound from optically measured images containing occluding objects
Recent developments in optical interferometry enable us to measure sound without placing any device inside the sound field. In particular, parallel phase-shifting interferometry (PPSI) has realized advanced measurement of the refractive index of air. A novel application investigated very recently is the simultaneous visualization of flow and sound, which had been difficult until PPSI enabled high-speed and accurate measurement several years ago. However, to understand aerodynamic sound, the air flow and the sound must be separated, since they are mixed in the observed video.
- Read more about Learning Motion Disfluencies for Automatic Sign Language Segmentation
We introduce a novel technique for the automatic detection of word boundaries within continuous sentence expressions in Japanese Sign Language from three-dimensional body joint positions. First, the flow of signed sentence data within a temporal neighborhood is determined utilizing the spatial correlations between line segments of inter-joint pairs. Next, a frame-wise binary random forest classifier is trained to distinguish word and non-word frame content based on the extracted spatio-temporal features.
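Once each frame carries a word or non-word label, word boundaries can be read off as the start and end of each run of word frames. The sketch below shows that grouping step only. It assumes the classifier output is already available as a list of 0/1 labels, and the function name `frames_to_segments` is hypothetical. The abstract does not say how the paper turns frame labels into boundaries, so this is one plausible reading rather than the authors' procedure.

```python
def frames_to_segments(labels):
    """Group consecutive frames labelled 1 (word) into (start, end) pairs.

    The end index is exclusive, so a segment covers frames start..end-1.
    """
    segments = []
    start = None
    for i, label in enumerate(labels):
        if label == 1 and start is None:
            start = i                      # a word run begins here
        elif label != 1 and start is not None:
            segments.append((start, i))    # the run ended on the previous frame
            start = None
    if start is not None:
        segments.append((start, len(labels)))  # run reaches the last frame
    return segments


# Hypothetical frame-wise labels: 1 = word frame, 0 = non-word frame.
labels = [0, 1, 1, 0, 0, 1, 1, 1, 0, 1]
segments = frames_to_segments(labels)  # -> [(1, 3), (5, 8), (9, 10)]
```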
- Read more about REVISITING HIDDEN MARKOV MODELS FOR SPEECH EMOTION RECOGNITION
Color-guided depth image denoising often suffers from texture copying from the color image as well as blurring at depth discontinuities. Motivated by this, we propose an optimized color-guided filter for denoising depth images corrupted by different types of noise. This new framework helps to mitigate texture copying and enhance depth discontinuities, especially under heavy noise. The framework consists of two parts: a depth-driven color flattening model and a patch synthesis-based Markov random field model.
- Read more about ANOMALY IMAGING FOR STRUCTURAL HEALTH MONITORING EXPLOITING CLUSTERED SPARSITY
We present a new tomography-based anomaly mapping algorithm for composite structures. The system consists of an array of piezoelectric transducers which sequentially excites the structure and collects the resulting waveform at the remaining transducers. Anomaly indices computed from the sensor waveforms are fed as input to the mapping algorithm. The output of the algorithm is a color map indicating the outline of damage on the structure when present.
- Read more about SPEAKER AGNOSTIC FOREGROUND SPEECH DETECTION FROM AUDIO RECORDINGS IN WORKPLACE SETTINGS FROM WEARABLE RECORDERS
Audio-signal acquisition as part of wearable sensing adds an important dimension for applications such as understanding human behaviors. As part of a large study on workplace behaviors, we collected audio data from individual hospital staff using custom wearable recorders. The audio features collected were limited to preserve the privacy of the interactions in the hospital. A first step towards audio processing is to identify the foreground speech of the person wearing the audio badge.