
Low Rank Based End-to-End Deep Neural Network Compression

Citation Author(s):
Swayambhoo Jain, Shahab Hamidi-Rad, Fabien Racape
Submitted by:
Swayambhoo Jain
Last updated:
1 March 2021 - 2:04pm
Document Type:
Presentation Slides
Document Year:
2021
Presenters:
Shahab Hamidi-Rad

Deep neural networks (DNNs), despite their performance on a wide variety of tasks, remain out of reach for many applications because they require significant computational resources. In this paper, we present a low-rank based end-to-end deep neural network compression framework with the goal of bringing DNN performance to computationally constrained devices. The proposed framework includes techniques for low-rank based structural approximation, quantization, and lossless arithmetic coding. Many of these techniques have been accepted into the MPEG working draft on compressed Neural Network Representations. We demonstrate the efficacy of the proposed framework via extensive experiments on a variety of DNNs for the tasks considered in this standardization activity. These techniques provide impressive performance on DNNs used in the ImageNet Large-Scale Visual Recognition Challenge, compressing VGG16 by 61×, ResNet50 by almost 15×, and MobileNetV2 by almost 7×.
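To illustrate the core idea behind low-rank based compression, the sketch below approximates a single dense weight matrix with a truncated SVD and then uniformly quantizes the resulting factors. This is a minimal illustration of the general technique, not the paper's exact pipeline; the function names, the rank choice, and the 8-bit symmetric quantization scheme are assumptions for demonstration.

```python
import numpy as np

def low_rank_compress(W, rank):
    """Approximate W (m x n) by U_r @ V_r, with U_r (m x rank), V_r (rank x n)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * s[:rank]  # fold singular values into the left factor
    V_r = Vt[:rank, :]
    return U_r, V_r

def quantize(x, num_bits=8):
    """Uniform symmetric quantization of a factor matrix to signed integers."""
    scale = np.abs(x).max() / (2 ** (num_bits - 1) - 1)
    q = np.round(x / scale).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512))

# Low-rank step: parameter count drops from m*n to rank*(m+n).
U_r, V_r = low_rank_compress(W, rank=32)
original_params = W.size                 # 256 * 512 = 131072
compressed_params = U_r.size + V_r.size  # 32 * (256 + 512) = 24576

# Quantization step: each factor is stored as int8 plus one float scale.
qU, sU = quantize(U_r)
qV, sV = quantize(V_r)
W_hat = (qU.astype(np.float32) * sU) @ (qV.astype(np.float32) * sV)
```

In a full pipeline such as the one described above, the quantized integer factors would then be entropy-coded (e.g., with arithmetic coding) for further size reduction.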
