Low Rank Based End-to-End Deep Neural Network Compression
- Submitted by: Swayambhoo Jain
- Last updated: 1 March 2021 - 2:04pm
- Document Type: Presentation Slides
- Document Year: 2021
- Presenters: Shahab Hamidi-Rad
Deep neural networks (DNNs), despite their performance on a wide variety of tasks, remain out of reach for many applications because they require significant computational resources. In this paper, we present a low-rank based end-to-end deep neural network compression framework with the goal of bringing DNN performance to computationally constrained devices. The proposed framework includes techniques for low-rank based structural approximation, quantization, and lossless arithmetic coding. Many of these techniques have been accepted into the MPEG working draft on compressed neural network representations. We demonstrate the efficacy of the proposed framework through extensive experiments on a variety of DNNs for the tasks considered in this standardization activity. These techniques deliver impressive performance on DNNs used in the ImageNet Large-Scale Visual Recognition Challenge, compressing VGG16 by 61×, ResNet50 by almost 15×, and MobileNetV2 by almost 7×.
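The abstract does not spell out the paper's exact pipeline, but the core low-rank structural approximation step can be illustrated with a minimal sketch: factorizing a dense layer's weight matrix via truncated SVD so that two small factors replace one large matrix. The function name, the example layer shape (512×1024), and the rank of 64 are illustrative choices, not values from the paper.

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Approximate W (m x n) as U_r @ V_r with U_r (m x rank), V_r (rank x n).

    Truncated SVD gives the best rank-r approximation in the Frobenius norm
    (Eckart-Young theorem). The factors store rank * (m + n) values
    instead of m * n, which is the source of the compression.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * s[:rank]   # fold singular values into the left factor
    V_r = Vt[:rank, :]
    return U_r, V_r

# Example: approximate a hypothetical 512 x 1024 dense layer with rank 64.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 1024))
U_r, V_r = low_rank_factorize(W, rank=64)
W_approx = U_r @ V_r

original_params = W.size                  # 512 * 1024 = 524288
compressed_params = U_r.size + V_r.size   # 64 * (512 + 1024) = 98304
print(f"compression ratio: {original_params / compressed_params:.1f}x")
```

In a full end-to-end pipeline such as the one the abstract describes, a factorization like this would typically be followed by fine-tuning, quantization of the factors, and entropy coding; those stages are not shown here.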