Sparse Subspace Clustering with Missing and Corrupted Data

In many settings, we can accurately model high-dimensional data as lying in a union of subspaces. Subspace clustering is the process of inferring the subspaces and determining which point belongs to each subspace. In this paper we study a ro- bust variant of sparse subspace clustering (SSC). While SSC is well-understood when there is little or no noise, less is known about SSC under significant noise or missing en- tries. We establish clustering guarantees in the presence of corrupted or missing entries. We give explicit bounds on the amount of additive noise and the number of missing entries the algorithm can tolerate, both in deterministic settings and in a random generative model. Our analysis shows that this method can tolerate up to O(n/d) missing entries per column instead of O(n/d2) as previous analyses show, where we have d-dimensional subspaces in an n-dimensional ambient space. Moreover, our method and analysis work by simply filling in the missing entries with zeros and do not need to know the location of missing entries.

sparse_subpsace_clustering_with_missing_and_corrupted_data.pdf

sparse_subpsace_clustering_with_missing_and_corrupted_data.pdf (861)

Thumbs Up

CITE

Documents

Presentation Slides

Sparse Subspace Clustering with Missing and Corrupted Data

sparse_subpsace_clustering_with_missing_and_corrupted_data.pdf

QUESTIONS?