
An Empirical Bayes Approach to Partially Labeled and Shuffled Data Sets

Citation Author(s):
Alex Dytso, H. Vincent Poor
Submitted by:
Alex Dytso
Last updated:
13 February 2020 - 3:22pm
Document Type:
Research Manuscript
Document Year:
2020
Event:
Presenters:
Alex Dytso
Paper Code:
2692
 

This work outlines a method for applying empirical Bayes to semi-supervised learning. That is, we consider a scenario in which the training set is partially or entirely unlabeled. In addition to missing labels, we also consider a scenario in which the available training data may be shuffled (i.e., the features and labels are not matched).

Specifically, we propose to train a model-based empirical Bayes model separately on the set of features and on the set of labels, and to combine (mix) the two models based on the proportion of unlabeled pairs. The method can then be used to recover the missing labels of the data set (i.e., to create pseudo-labels) and, if the data are shuffled, to recover the correct permutation of the data. The technique is evaluated for a multivariate Gaussian model and is shown to consistently outperform a maximum likelihood approach. Moreover, the procedure is shown to be a consistent estimator for a multivariate Gaussian model with an arbitrary (non-degenerate) covariance matrix.
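
To make the two recovery tasks concrete, the following is a minimal plug-in sketch for a jointly Gaussian model. It is an illustration under stated assumptions, not the algorithm from the manuscript: labels are taken to be scalar, the marginal moments are estimated from all available features and labels, the cross-covariance is estimated from whatever matched pairs exist, and the mixing step based on the proportion of unlabeled pairs is not modeled. The function names `fit_plugin`, `pseudo_labels`, and `recover_permutation` are hypothetical.

```python
import numpy as np

def fit_plugin(X_all, y_all, X_pair, y_pair):
    """Plug-in Gaussian moments (assumed setup, scalar labels).

    Marginal moments use all features and all labels, which remain valid
    even when the data are shuffled. The cross-covariance needs matched
    pairs, because shuffling destroys it.
    """
    mu_x = X_all.mean(axis=0)
    mu_y = y_all.mean()
    # bias=True normalizes by n, matching the cross-covariance estimate below.
    S_xx = np.atleast_2d(np.cov(X_all, rowvar=False, bias=True))
    S_xy = ((X_pair - mu_x) * (y_pair - mu_y)[:, None]).mean(axis=0)
    return mu_x, mu_y, S_xx, S_xy

def pseudo_labels(X, params):
    """Gaussian conditional mean E[y | x] with plug-in moments."""
    mu_x, mu_y, S_xx, S_xy = params
    w = np.linalg.solve(S_xx, S_xy)
    return mu_y + (X - mu_x) @ w

def recover_permutation(X_shuf, y_shuf, params):
    """Return perm such that y_shuf[perm[i]] is assigned to X_shuf[i].

    For scalar labels, matching the k-th smallest prediction to the k-th
    smallest observed label minimizes the total squared mismatch.
    """
    pred = pseudo_labels(X_shuf, params)
    perm = np.empty(len(y_shuf), dtype=int)
    perm[np.argsort(pred)] = np.argsort(y_shuf)
    return perm

# Synthetic demo: 50 matched pairs, 150 features whose labels are shuffled.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([1.0, -2.0]) + 0.1 * rng.normal(size=200)
sigma = rng.permutation(150)
X_shuf, y_shuf = X[50:], y[50:][sigma]

params = fit_plugin(X, y, X[:50], y[:50])
perm = recover_permutation(X_shuf, y_shuf, params)
mse = np.mean((y_shuf[perm] - y[50:]) ** 2)
print("mean squared error of re-matched labels:", mse)
```

With vector-valued labels, the sorting step would have to be replaced by a general assignment solver. The sort-based matching is used here only because it is exact for the scalar case.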
