
The use of deep networks to extract embeddings for speaker recognition has proven successful. However, such embeddings are susceptible to performance degradation due to mismatches among the training, enrollment, and test conditions. In this work, we propose an adversarial speaker verification (ASV) scheme to learn condition-invariant deep embeddings via adversarial multi-task training. In ASV, a speaker classification network and a condition identification network are jointly optimized to minimize the speaker classification loss while simultaneously mini-maximizing the condition loss.
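The min-max objective above can be illustrated on a toy problem. The sketch below is not the paper's network: it uses a single shared scalar parameter, a quadratic stand-in for each loss, and numerical gradients. The gradient-reversal idea is visible in the update, where the condition-loss gradient enters with a negative sign so the shared parameter learns to *increase* the condition loss (i.e., become condition-invariant from the discriminator's point of view). The targets 2.0 and -1.0 and the weight `lam` are arbitrary illustrative choices.

```python
def speaker_loss(theta, target=2.0):
    # Stand-in for the speaker classification loss.
    return (theta - target) ** 2

def condition_loss(theta, target=-1.0):
    # Stand-in for the condition identification loss.
    return (theta - target) ** 2

def grad(f, theta, eps=1e-6):
    # Central-difference numerical gradient of a scalar loss.
    return (f(theta + eps) - f(theta - eps)) / (2 * eps)

def train(theta=0.0, lam=0.3, lr=0.05, steps=200):
    for _ in range(steps):
        # Gradient reversal: the condition gradient is subtracted, so the
        # shared parameter is pushed to maximize the condition loss while
        # minimizing the speaker loss.
        g = grad(speaker_loss, theta) - lam * grad(condition_loss, theta)
        theta -= lr * g
    return theta
```

The fixed point of this update sits past the speaker optimum, away from the condition target, which is the qualitative behavior adversarial multi-task training aims for.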


Teacher-student (T/S) learning has been shown to be effective for a variety of problems such as domain adaptation and model compression. One shortcoming of T/S learning is that the teacher model, which is not always perfect, sporadically produces wrong guidance in the form of posterior probabilities, misleading the student model toward suboptimal performance.
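The standard T/S objective this abstract builds on can be sketched in a few lines. The code below is a minimal illustration, not the paper's method: it computes the KL divergence from the teacher's posterior to the student's, and optionally interpolates with a one-hot ground-truth target via a weight `alpha`, one common way to hedge against wrong teacher guidance when hard labels are available.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) between two discrete distributions (lists of probabilities).
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def ts_loss(teacher_post, student_post, hard_label=None, alpha=1.0):
    # Soft T/S term: student matches the teacher's posterior.
    loss = alpha * kl_divergence(teacher_post, student_post)
    # Optional hard-label term guards against a wrong teacher posterior.
    if hard_label is not None:
        one_hot = [1.0 if i == hard_label else 0.0
                   for i in range(len(student_post))]
        loss += (1 - alpha) * kl_divergence(one_hot, student_post)
    return loss
```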


In this paper, we study a 2D tomography problem for point source models with random unknown view angles. Rather than recovering the projection angles, we reconstruct the model through a set of rotation-invariant features that are estimated from the projection data. For a point source model, we show that these features reveal geometric information about the model, such as the radial and pairwise distances. This establishes a connection between unknown-view tomography and the unassigned distance geometry problem (uDGP).
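The key property the abstract relies on, that radial and pairwise distances are unchanged by rotation, is easy to verify numerically. The sketch below (an illustration, not the paper's estimator, which recovers these features from projection data rather than from the points themselves) computes both feature sets for a 2D point source model and is invariant under any planar rotation. Note that the multiset of pairwise distances, without point assignments, is exactly the input to the uDGP.

```python
import math
from itertools import combinations

def rotate(points, angle):
    # Rotate 2D points about the origin by `angle` radians.
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def invariant_features(points):
    # Rotation-invariant features of a 2D point source model:
    # sorted radial distances and the sorted multiset of pairwise
    # distances (the latter is the uDGP input).
    radial = sorted(math.hypot(x, y) for x, y in points)
    pairwise = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    return radial, pairwise
```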


A novel maximum likelihood trajectory estimation algorithm for targets in mixed stationary/moving conditions is presented. The proposed approach is able to estimate the position and velocity of the target over arbitrarily complex trajectories, while explicitly taking into account the possibility of stop-and-go motion. Moreover, a novel trajectory reconstruction method based on the theory of Bézier curves is developed for online smoothing of the trajectory, which keeps the advantages of Bayesian smoothing while introducing only a fixed lag in the estimation process.
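A Bézier curve of the kind used for the smoothing step can be evaluated with de Casteljau's algorithm. The snippet below is a generic evaluator for any number of control points, shown only to make the curve construction concrete; the paper's actual smoother (how control points are fit to the estimated trajectory, and the fixed-lag scheme) is not reproduced here.

```python
def bezier_point(control_points, t):
    # De Casteljau's algorithm: evaluate a Bezier curve at parameter
    # t in [0, 1] by repeated linear interpolation of control points.
    pts = [list(p) for p in control_points]
    while len(pts) > 1:
        pts = [[(1 - t) * a + t * b for a, b in zip(p, q)]
               for p, q in zip(pts, pts[1:])]
    return tuple(pts[0])
```

The curve starts at the first control point, ends at the last, and smoothly interpolates in between, which is what makes it a natural basis for online trajectory smoothing.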


Increasingly, post-secondary instructors are incorporating innovative teaching practices into their classrooms to improve student learning outcomes. In order to assess the effect of these techniques, it is helpful to quantify the types of activity being conducted in the classroom. Unfortunately, self-reporting is unreliable, and manual annotation is tedious and scales poorly.


Back-propagation (BP) is now a classic learning paradigm whose source of supervision comes exclusively from the external (input/output) nodes. Consequently, BP is vulnerable to the curse of depth in (very) deep learning networks (DLNs). This prompts us to advocate Internal Neuron's Learnability (INL), with (1) internal teacher labels (ITL) and (2) internal optimization metrics (IOM) for evaluating hidden layers/nodes. Conceptually, INL is a step beyond the notion of Internal Neuron's Explainability (INE), championed by

Configuring hybrid precoders and combiners is the main challenge to be solved when operating at millimeter wave (mmWave) frequencies. The use of hybrid architectures imposes hardware constraints on the analog precoder that need to be carefully dealt with. In this paper, we develop hybrid precoders and combiners that minimize the Euclidean distance to the approximate all-digital precoders and combiners maximizing the spectral efficiency under per-antenna power constraints.
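The "minimize the Euclidean distance to an all-digital precoder" formulation can be made concrete with a deliberately simple heuristic, which is not the paper's algorithm: the constant-modulus constraint of RF phase shifters is satisfied by keeping only the phase of each entry of the target all-digital vector, and a single least-squares baseband scalar then minimizes the residual distance. Real designs use multiple RF chains and a digital matrix, but the projection idea is the same.

```python
import math
import cmath

def hybrid_approximation(f_digital):
    # Toy hybrid approximation of a target all-digital precoding vector:
    # 1) analog precoder keeps only per-entry phases (constant modulus,
    #    as enforced by RF phase shifters);
    # 2) a scalar baseband weight g minimizes ||f - g * w|| in the
    #    least-squares sense.
    n = len(f_digital)
    w = [cmath.exp(1j * cmath.phase(fi)) for fi in f_digital]
    g = sum(wi.conjugate() * fi for wi, fi in zip(w, f_digital)) / n
    residual = math.sqrt(sum(abs(fi - g * wi) ** 2
                             for fi, wi in zip(f_digital, w)))
    return w, g, residual
```

When the target precoder already has constant modulus, this projection is exact (zero residual); magnitude variation across antennas is what the digital stage must absorb.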


We propose a training scheme to train neural network-based source separation algorithms from scratch when parallel clean data is unavailable. In particular, we demonstrate that an unsupervised spatial clustering algorithm is sufficient to guide the training of a deep clustering system. We argue that previous work on deep clustering requires strong supervision and elaborate on why this is a limitation.
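The pseudo-labeling step that lets an unsupervised spatial clustering algorithm guide training can be sketched with a minimal 1-D k-means. This is only a stand-in for the spatial clustering the abstract refers to: here `values` plays the role of a scalar spatial cue per time-frequency bin, and the returned labels would serve as training targets for the deep clustering network (which is not shown).

```python
def kmeans_1d(values, k=2, iters=50):
    # Minimal 1-D k-means producing pseudo-labels from a scalar spatial
    # feature. Centers are initialized evenly across the value range.
    lo, hi = min(values), max(values)
    centers = [lo + i * (hi - lo) / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j]))
                  for v in values]
        # Recompute each center as the mean of its members.
        for j in range(k):
            members = [v for v, l in zip(values, labels) if l == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels, centers
```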