Low Rank Based End-to-End Deep Neural Network Compression
- Submitted by: Swayambhoo Jain
- Last updated: 1 March 2021 - 2:04pm
- Document Type: Presentation Slides
- Document Year: 2021
- Presenters: Shahab Hamidi-Rad
Deep neural networks (DNNs), despite their performance on a wide variety of tasks, remain out of reach for many applications because they require significant computational resources. In this paper, we present a low-rank based end-to-end deep neural network compression framework with the goal of bringing DNN performance to computationally constrained devices. The proposed framework includes techniques for low-rank based structural approximation, quantization, and lossless arithmetic coding. Many of these techniques have been accepted into the MPEG working draft on compressed neural network representations. We demonstrate the efficacy of the proposed framework through extensive experiments on a variety of DNNs for the tasks considered in this standardization activity. These techniques deliver impressive performance on DNNs used in the ImageNet Large-Scale Visual Recognition Challenge, compressing VGG16 by 61×, ResNet50 by almost 15×, and MobileNetV2 by almost 7×.
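The abstract does not spell out the paper's exact pipeline, but the core low-rank structural approximation step can be illustrated with a minimal sketch: factorizing a dense layer's weight matrix via truncated SVD so that two small factors replace one large matrix. The function name, the example layer shape (512×1024), and the rank of 64 are illustrative choices, not values from the paper.

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Approximate W (m x n) as U_r @ V_r with U_r (m x rank), V_r (rank x n).

    Truncated SVD gives the best rank-r approximation in the Frobenius norm
    (Eckart-Young theorem). The factors store rank * (m + n) values
    instead of m * n, which is the source of the compression.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * s[:rank]   # fold singular values into the left factor
    V_r = Vt[:rank, :]
    return U_r, V_r

# Example: approximate a hypothetical 512 x 1024 dense layer with rank 64.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 1024))
U_r, V_r = low_rank_factorize(W, rank=64)
W_approx = U_r @ V_r

original_params = W.size                  # 512 * 1024 = 524288
compressed_params = U_r.size + V_r.size   # 64 * (512 + 1024) = 98304
print(f"compression ratio: {original_params / compressed_params:.1f}x")
```

In a full end-to-end pipeline such as the one the abstract describes, a factorization like this would typically be followed by fine-tuning, quantization of the factors, and entropy coding; those stages are not shown here.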