
We study the problem of direction of arrival estimation for arbitrary antenna
arrays. We formulate it as a continuous line spectral estimation problem and solve it under
a sparsity prior without any gridding assumptions. Moreover, we incorporate the
array's beampattern in the form of the Effective Aperture Distribution Function
(EADF), which allows the use of arbitrary (synthetic as well as measured) antenna
arrays. This generalizes known atomic norm based grid-free DOA estimation methods (that


In this paper, we collect a trivial event speech database that involves 75 speakers and 6 types of events, and report preliminary speaker recognition results on this database from both human listeners and machines. In particular, the deep feature learning technique recently proposed by our group is used to analyze and recognize the trivial events, yielding acceptable equal error rates (EERs) ranging from 5% to 15% despite the extremely short durations (0.2-0.5 seconds) of these events. Among the event types compared, ‘hmm’ appears to be more speaker-discriminative than the others.
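
The EER quoted above is the operating point at which the false acceptance rate equals the false rejection rate. A minimal sketch of how it can be estimated from trial scores follows; the function name `equal_error_rate` and the toy scores are illustrative assumptions, not the evaluation code or data behind the reported results.

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Estimate the EER from same-speaker and different-speaker scores.

    Hypothetical helper: higher scores are assumed to mean "same speaker".
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    # Every observed score is a candidate decision threshold.
    thresholds = np.unique(np.concatenate([genuine, impostor]))
    # False rejection rate: genuine trials scoring below the threshold.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # False acceptance rate: impostor trials scoring at or above the threshold.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    # Take the threshold where the two error rates are closest.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0

# Toy scores (made up for illustration only).
eer = equal_error_rate([0.9, 0.8, 0.7, 0.6], [0.1, 0.2, 0.3, 0.65])
print(f"EER = {eer:.2%}")
```

With so few trials the estimate is coarse; on a real trial list the two error curves are usually interpolated around the crossing point.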



As we are surrounded by an increasing number of mobile devices equipped with wireless links and multiple microphones (e.g., smartphones, tablets, laptops, and hearing aids), using them collaboratively for acoustic processing is a promising platform for emerging applications.


Deep learning is still not a very common tool in the speaker verification field. We study the performance of deep convolutional neural networks in the text-prompted speaker verification task. The prompted passphrase is segmented into word states (i.e., digits) so that each digit utterance is tested separately. We train a single high-level feature extractor for all states and use the cosine similarity metric for scoring. The key feature of our network is the Max-Feature-Map activation function, which acts as an embedded feature selector.
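
The abstract does not spell out the network, so the following is only a rough sketch of the two named ingredients under stated assumptions: Max-Feature-Map is taken in its commonly described form (split the channels into two halves and keep the element-wise maximum), and scoring is plain cosine similarity between two embedding vectors. The function names and the toy arrays are hypothetical.

```python
import numpy as np

def max_feature_map(x, axis=-1):
    """Max-Feature-Map activation (assumed form).

    Splits `x` into two equal halves along `axis` and keeps the larger value
    at each position, so the output has half as many channels as the input.
    np.split raises ValueError if the channel count is odd.
    """
    first_half, second_half = np.split(np.asarray(x, dtype=float), 2, axis=axis)
    return np.maximum(first_half, second_half)

def cosine_score(u, v, eps=1e-12):
    """Cosine similarity between an enrollment and a test embedding."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps))

# Toy activations with 4 channels; the output keeps 2 of them.
features = np.array([[1.0, -2.0, 3.0, 0.5]])
selected = max_feature_map(features)

# Toy embeddings for one digit state (made up for illustration only).
score = cosine_score([1.0, 0.0], [1.0, 0.0])
print(selected, score)
```

Because each output unit passes on only the stronger of two competing channels, the activation behaves like a built-in selector, which is one way to read the "embedded feature selector" remark.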


When using optimization methods with matrix variables in signal processing and machine learning, it is customary to assume some low-rank prior on the targeted solution. Nonnegative matrix factorization of spectrograms is a case in point in audio signal processing. However, this low-rank prior is not straightforwardly related to the complex matrices obtained from a short-time Fourier transform (STFT), also known as the discrete Gabor transform, which is generally defined and studied in terms of a modulation operator and a translation operator applied to a so-called window.
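
For concreteness, one standard way to write the STFT in terms of these two operators is sketched below. The notation is an assumption made here for illustration (a signal $x$ and window $g$ of length $L$, indices taken modulo $L$) and need not match the paper's conventions.

```latex
% Translation and modulation operators acting on a window g of length L
(T_n g)[t] = g[t - n], \qquad (M_k g)[t] = e^{2\pi i k t / L}\, g[t]

% STFT coefficient at frequency index k and time index n
X[k, n] = \langle x,\, M_k T_n g \rangle
        = \sum_{t=0}^{L-1} x[t]\, \overline{g[t - n]}\, e^{-2\pi i k t / L}
```

The matrix $X$ is complex-valued, whereas the spectrogram $|X|^2$ factorized by nonnegative matrix factorization is nonnegative, which is why the low-rank assumption does not carry over directly.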

