MODELLING LARGE SCALE DATASETS USING PARTITIONING-BASED PCA
- Submitted by:
- Salaheddin Alakkari
- Last updated:
- 20 September 2019 - 12:08am
- Document Type:
- Poster
- Document Year:
- 2019
- Paper Code:
- 3447
In this study, we propose an efficient approach for modelling and compressing large-scale datasets. The main idea is to subdivide each sample into smaller partitions, each comprising a particular subset of attributes, and then apply PCA to each partition separately. This simple approach has several key advantages over the traditional holistic scheme, namely reduced computational cost and enhanced reconstruction quality. We study two variants of this approach: cell-based PCA for image datasets, where samples are spatially divided into smaller blocks, and the more general band-based PCA, where attributes are partitioned based on the distribution of their values.
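The partition-then-PCA idea can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`cell_partitions`, `fit_partition_pca`, `reconstruct`), the 4x4 cell size, the number of components `k`, and the synthetic data are all assumptions chosen for illustration. The sketch covers the cell-based variant, where each partition is a square block of pixel indices.

```python
import numpy as np

def cell_partitions(height, width, cell):
    """Return one array of flat pixel indices per cell x cell block (hypothetical helper)."""
    idx = np.arange(height * width).reshape(height, width)
    return [idx[r:r + cell, c:c + cell].ravel()
            for r in range(0, height, cell)
            for c in range(0, width, cell)]

def fit_partition_pca(X, partitions, k):
    """Fit an independent k-component PCA on each attribute subset."""
    models = []
    for cols in partitions:
        block = X[:, cols]
        mean = block.mean(axis=0)
        # Rows of vt are the principal directions, ordered by singular value.
        _, _, vt = np.linalg.svd(block - mean, full_matrices=False)
        models.append((cols, mean, vt[:k]))
    return models

def reconstruct(X, models):
    """Project each partition onto its own components and map back."""
    X_hat = np.zeros_like(X, dtype=float)
    for cols, mean, comps in models:
        codes = (X[:, cols] - mean) @ comps.T   # compressed representation
        X_hat[:, cols] = codes @ comps + mean   # approximate reconstruction
    return X_hat

# Synthetic stand-in for an image dataset: 200 samples of 16x16 pixels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16 * 16))

parts = cell_partitions(16, 16, cell=4)      # 16 cells of 16 pixels each
models = fit_partition_pca(X, parts, k=4)    # 4 components per cell (assumed)
X_hat = reconstruct(X, models)
```

Because `fit_partition_pca` accepts any list of index arrays, the same sketch would apply to other partitioning rules by swapping `cell_partitions` for a different grouping of attributes. Each decomposition here operates on a block of 16 attributes rather than on all 256 at once, which is where a saving in computational cost would be expected to come from.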