Correlation-Aware Joint Pruning-Quantization Using Graph Neural Networks
Deep learning for image classification has achieved remarkable success, but at the cost of high resource demands. Model compression through automatic joint pruning-quantization addresses this issue, yet most existing techniques overlook a critical aspect: layer correlations. These correlations matter because they expose redundant computations across layers, and leveraging them enables efficient design-space exploration. This study employs graph neural networks (GNNs) to learn these inter-layer relationships and thereby optimize the pruning-quantization strategy for the target model.
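A rough sketch of the idea (not the paper's code) is shown below: a small message-passing GNN, written in plain PyTorch, reads per-layer features of a CNN together with a layer-adjacency matrix and predicts a pruning ratio and a bit-width per layer. The class name `LayerGNN`, the feature layout, and the bit-width range are assumptions made for illustration.

```python
# Hypothetical sketch: a tiny message-passing GNN over the layer graph of a CNN.
# Each node is a layer with a few descriptive features (e.g. channels, FLOPs, depth);
# message passing lets each layer's compression decision reflect cross-layer correlations.
import torch
import torch.nn as nn

class LayerGNN(nn.Module):
    def __init__(self, in_dim=4, hidden=64, rounds=2):
        super().__init__()
        self.encode = nn.Linear(in_dim, hidden)
        self.message = nn.Linear(hidden, hidden)
        self.update = nn.GRUCell(hidden, hidden)
        self.rounds = rounds
        # Two heads: pruning ratio in (0, 1) and a soft bit-width in [2, 8].
        self.prune_head = nn.Sequential(nn.Linear(hidden, 1), nn.Sigmoid())
        self.bit_head = nn.Sequential(nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, layer_feats, adj):
        # layer_feats: (num_layers, in_dim); adj: (num_layers, num_layers)
        h = torch.relu(self.encode(layer_feats))
        for _ in range(self.rounds):
            # Aggregate messages from connected layers, then update node states.
            m = adj @ self.message(h)
            h = self.update(m, h)
        prune_ratio = self.prune_head(h).squeeze(-1)
        bits = 2.0 + 6.0 * self.bit_head(h).squeeze(-1)
        return prune_ratio, bits

# Example: a 5-layer chain-structured network with made-up features.
feats = torch.randn(5, 4)
adj = torch.diag(torch.ones(4), 1) + torch.diag(torch.ones(4), -1)
ratios, bits = LayerGNN()(feats, adj)
```

In practice such a predictor would be trained with a search or reinforcement signal that trades accuracy against the resulting model size, but that loop is omitted here.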
End-to-end learned image compression (LIC) has become a promising alternative for lossy image compression. However, deployment of LIC models is restricted by excessive network parameters and high computational complexity. Existing LIC models implemented entirely with integer networks suffer significant degradation in rate-distortion (R-D) performance. In this paper, we propose a novel fully integerized model for LIC that leverages channel-wise weight and activation quantization.
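For context, the sketch below shows plain channel-wise symmetric weight quantization to signed 8-bit integers with one scale per output channel; it illustrates the general mechanism only and is not the paper's integerization scheme.

```python
# Minimal sketch of channel-wise symmetric weight quantization (illustrative only).
import torch

def quantize_weight_per_channel(w: torch.Tensor, num_bits: int = 8):
    """w: conv weight of shape (out_ch, in_ch, kH, kW)."""
    qmax = 2 ** (num_bits - 1) - 1                       # e.g. 127 for 8 bits
    # One scale per output channel, taken from that channel's largest magnitude.
    max_abs = w.abs().amax(dim=(1, 2, 3), keepdim=True).clamp(min=1e-8)
    scale = max_abs / qmax
    w_int = torch.round(w / scale).clamp(-qmax - 1, qmax).to(torch.int8)
    return w_int, scale                                  # w ≈ w_int * scale

w = torch.randn(16, 3, 5, 5)
w_int, scale = quantize_weight_per_channel(w)
print((w - w_int.float() * scale).abs().max())           # small quantization error
```

Activations are typically quantized analogously, with scales calibrated from the ranges observed on representative inputs.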
Mixed-precision Deep Neural Network Quantization With Multiple Compression Rates
Quantizing a single deep neural network to multiple compression rates (precisions) has recently been considered for flexible deployment in real-world scenarios. However, existing methods for network quantization under multiple compression rates either use fixed-precision bit-width allocation or heuristically search for a mixed-precision strategy, and cannot balance efficiency and performance well.
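As a hedged illustration of the kind of heuristic the abstract refers to (not the proposed method), the sketch below greedily lowers the bit-width of the least sensitive layers until a parameter-weighted average bit-width budget is met, producing one allocation per target compression rate. The layer sizes and sensitivity scores are made-up numbers.

```python
# Illustrative greedy mixed-precision bit allocation under an average-bit budget.
from typing import Dict, List

def allocate_bits(param_counts: List[int],
                  sensitivity: List[float],
                  avg_bits_target: float,
                  candidate_bits=(8, 6, 4, 2)) -> List[int]:
    """Lower bits of the least sensitive layers until the
    parameter-weighted average bit-width meets the target."""
    bits = [candidate_bits[0]] * len(param_counts)
    total = sum(param_counts)

    def avg_bits():
        return sum(b * n for b, n in zip(bits, param_counts)) / total

    # Repeatedly drop one precision step on the layer whose
    # sensitivity-per-parameter ratio is lowest, until the budget is satisfied.
    while avg_bits() > avg_bits_target:
        candidates = [i for i, b in enumerate(bits) if b > candidate_bits[-1]]
        if not candidates:
            break
        i = min(candidates, key=lambda i: sensitivity[i] / param_counts[i])
        bits[i] = candidate_bits[candidate_bits.index(bits[i]) + 1]
    return bits

# One strategy per compression rate, e.g. 6-, 4- and 3-bit average budgets.
params = [30_000, 120_000, 500_000, 80_000]
sens = [0.9, 0.4, 0.1, 0.6]
plans: Dict[float, List[int]] = {t: allocate_bits(params, sens, t) for t in (6, 4, 3)}
print(plans)
```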
SRQ: Self-reference quantization scheme for lightweight neural network
Lightweight neural networks (LNNs) now play a vital role in embedded applications with limited resources. Quantizing an LNN to low bit precision is an effective solution that further reduces computational and memory requirements. However, avoiding significant accuracy degradation relative to a full-size network remains challenging because of the numerical approximation and lower redundancy of an LNN. In this paper, we propose a novel robustness-aware self-reference quantization scheme for LNNs (SRQ).