
We introduce a new Nonnegative Matrix Factorization (NMF) model called Nonnegative Unimodal Matrix Factorization (NuMF), which adds to NMF a unimodality condition on the columns of the basis matrix. NuMF finds applications in, for example, analytical chemistry. We propose a simple but naive brute-force heuristic based on accelerated projected gradient. We then improve it with a multi-grid method, for which we prove that the restriction operator preserves unimodality.
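To make the unimodality condition concrete, here is a minimal sketch of a check that every column of a basis matrix is nonnegative and unimodal. It assumes one common definition (entries are nondecreasing up to some peak index and nonincreasing afterwards); the function names `is_unimodal` and `columns_are_unimodal` are hypothetical and are not taken from the paper, and this is only a validity check, not the proposed algorithm.

```python
def is_unimodal(x):
    """Return True if x is nondecreasing up to some index and nonincreasing after it."""
    n = len(x)
    if n == 0:
        return True
    i = 0
    # Climb while the sequence does not decrease.
    while i + 1 < n and x[i] <= x[i + 1]:
        i += 1
    # Descend while the sequence does not increase.
    while i + 1 < n and x[i] >= x[i + 1]:
        i += 1
    # Unimodal only if the climb followed by the descent covered the whole vector.
    return i == n - 1


def columns_are_unimodal(columns):
    """Check that every column (a list of numbers) is nonnegative and unimodal."""
    return all(min(col) >= 0 and is_unimodal(col) for col in columns if len(col) > 0)
```

A column such as `[0, 1, 3, 2, 0]` passes the check, while `[0, 2, 1, 3]` fails because it rises again after its first peak.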


One of the leading single-channel speech separation (SS) models is based on a TasNet with a dual-path segmentation technique, where the size of each segment remains unchanged throughout all layers. In contrast, our key finding is that multi-granularity features are essential for enhancing contextual modeling and computational efficiency. We introduce a self-attentive network with a novel sandglass shape, namely Sandglasset, which advances the state-of-the-art (SOTA) SS performance with a significantly smaller model size and lower computational cost.


Hand-crafted spatial features (e.g., inter-channel phase difference, IPD) play a fundamental role in recent deep-learning-based multi-channel speech separation (MCSS) methods. However, these manually designed spatial features are hard to incorporate into an end-to-end optimized MCSS framework. In this work, we propose an integrated architecture for learning spatial features directly from the multi-channel speech waveforms within an end-to-end speech separation framework. In this architecture, time-domain filters spanning signal channels are trained to perform adaptive spatial filtering.
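For readers unfamiliar with the hand-crafted feature mentioned above, the following is a minimal sketch of how an inter-channel phase difference can be computed from two microphone signals. It illustrates the conventional feature, not the learned filters proposed in the paper. The function name `ipd`, the frame length, the hop size, and the periodic Hann window are all assumptions chosen for illustration.

```python
import numpy as np


def ipd(ch_a, ch_b, frame_len=64, hop=32):
    """Inter-channel phase difference per frame and frequency bin, in radians."""
    # Periodic Hann window (an assumed choice; any analysis window could be used).
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame_len) / frame_len)

    def stft(x):
        x = np.asarray(x, dtype=float)
        n_frames = 1 + (len(x) - frame_len) // hop
        frames = np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])
        return np.fft.rfft(frames * window, axis=-1)

    spec_a, spec_b = stft(ch_a), stft(ch_b)
    # Phase of A times conj(B) equals phase(A) - phase(B), wrapped to (-pi, pi].
    return np.angle(spec_a * np.conj(spec_b))
```

The result has shape `(n_frames, frame_len // 2 + 1)`. For two sinusoids of the same frequency, the value at the corresponding bin is the phase lag of the second channel relative to the first.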


Functional connectivity analysis by detecting neuronal coactivation in the brain can be done efficiently using resting-state functional magnetic resonance imaging (rs-fMRI). Most existing research in this area employs correlation-based group-averaging strategies involving spatial smoothing and temporal normalization of fMRI scans, and the reliability of their results depends heavily on the voxel resolution of the fMRI scan as well as on the scanning duration. Most studies have chosen a scanning period of 5 to 11 minutes when estimating the connectivity of brain networks.
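As a rough illustration of what "correlation-based" means here, the sketch below computes a Pearson correlation matrix between regional time series. It assumes the input is already preprocessed and arranged as one row per region and one column per time point; the function name `connectivity_matrix` is hypothetical, and the smoothing, normalization, and group-averaging steps mentioned above are not shown.

```python
import numpy as np


def connectivity_matrix(time_series):
    """Pearson correlation between every pair of rows (regions) of a 2-D array."""
    ts = np.asarray(time_series, dtype=float)
    # np.corrcoef treats each row as one variable and each column as one observation.
    return np.corrcoef(ts)
```

Entry `(i, j)` of the result is the correlation between regions `i` and `j`, so the matrix is symmetric with ones on the diagonal.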


Deep Clustering (DC) and Deep Attractor Networks (DANs) are data-driven approaches to monaural blind source separation.
Both approaches provide astonishing single-channel performance but have not yet been generalized to block-online processing.
When separating speech in a continuous stream with a block-online algorithm, it must be determined in each block which output stream belongs to which speaker.
In this contribution we solve this block permutation problem by introducing an additional speaker identification embedding to the DAN model structure.
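The block permutation problem can be illustrated with a small sketch. It assumes that each output stream in a block comes with a speaker embedding vector and that streams are matched to reference speakers by maximizing total cosine similarity over all permutations. This matching rule and the function name `align_block` are assumptions made for illustration; the abstract does not specify how the embeddings are used.

```python
import itertools

import numpy as np


def align_block(ref_emb, cur_emb):
    """Return perm such that current stream perm[i] is assigned to reference speaker i."""
    ref = np.asarray(ref_emb, dtype=float)
    cur = np.asarray(cur_emb, dtype=float)
    # Normalize rows so that the dot product is the cosine similarity.
    ref = ref / np.linalg.norm(ref, axis=1, keepdims=True)
    cur = cur / np.linalg.norm(cur, axis=1, keepdims=True)
    sim = ref @ cur.T
    n = sim.shape[0]
    # Exhaustive search is fine for the small speaker counts typical of separation.
    best = max(
        itertools.permutations(range(n)),
        key=lambda p: sum(sim[i, p[i]] for i in range(n)),
    )
    return list(best)
```

With reference embeddings `[[1, 0], [0, 1]]` and current-block embeddings `[[0.1, 0.9], [0.8, 0.2]]`, the two output streams have swapped order relative to the reference, and the function returns `[1, 0]`.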


Robust speech processing in multi-talker environments requires effective speech separation. Recent deep learning systems have made significant progress toward solving this problem, yet it remains challenging, particularly in real-time, short-latency applications. Most methods attempt to construct a mask for each source in a time-frequency representation of the mixture signal, which is not necessarily an optimal representation for speech separation.
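The masking approach referred to above can be sketched as follows. This example assumes an oracle ratio mask computed from known source spectrograms, which is only available when the true sources are given; in a real system a network would estimate the masks from the mixture. The function names `ratio_masks` and `apply_masks` and the `eps` constant are hypothetical.

```python
import numpy as np


def ratio_masks(source_specs, eps=1e-8):
    """One mask per source: its magnitude divided by the summed magnitudes of all sources."""
    mags = np.abs(np.asarray(source_specs))
    # eps avoids division by zero in bins where every source is silent.
    return mags / (mags.sum(axis=0, keepdims=True) + eps)


def apply_masks(mixture_spec, masks):
    """Multiply the complex mixture spectrogram by each real-valued mask."""
    return masks * np.asarray(mixture_spec)[None, ...]
```

Because the masks sum to approximately one in every time-frequency bin, the masked estimates add back up to the mixture spectrogram.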


The state of the art in music source separation employs neural networks trained in a supervised fashion on multi-track databases to estimate the sources from a given mixture. With only a few datasets available, extensive data augmentation is often used to combat overfitting. Mixing random tracks, however, can even reduce separation performance, as instruments in real music are strongly correlated. The key concept in our approach is that source estimates of an optimal separator should be indistinguishable from real source signals.


In this article, we propose a Bounded Component Analysis (BCA) approach for the separation of convolutive mixtures of sparse sources. The corresponding algorithm is derived from a geometric objective function defined over a completely deterministic setting. Therefore, it is applicable to sources that may be independent or dependent in both the space and time dimensions. We show that all global optima of the proposed objective are perfect separators. We also provide numerical examples to illustrate the performance of the algorithm.