
Deep learning for image classification has achieved remarkable success, but at the cost of high resource demands. Model compression through automatic joint pruning-quantization addresses this issue, yet most existing techniques overlook a critical aspect: layer correlations. These correlations are essential because they expose redundant computations across layers, and leveraging them facilitates efficient design-space exploration. This study employs Graph Neural Networks (GNNs) to learn these inter-layer relationships and thereby optimize the pruning-quantization strategy for the target model.
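
As a rough illustration of the idea (not the authors' implementation), the sketch below builds a small message-passing network in PyTorch that maps hand-crafted per-layer features and a layer-adjacency matrix to a per-layer pruning ratio and bit-width; the feature set, the two output heads, and the bit-width range [2, 8] are all assumptions.

```python
import torch
import torch.nn as nn

class LayerGraphPolicy(nn.Module):
    """GNN-style policy: per-layer features + layer adjacency ->
    a pruning ratio and a bit-width for every layer (illustrative)."""
    def __init__(self, in_dim, hidden_dim=64):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden_dim)
        self.lin2 = nn.Linear(hidden_dim, hidden_dim)
        self.prune_head = nn.Linear(hidden_dim, 1)  # pruning ratio in (0, 1)
        self.bit_head = nn.Linear(hidden_dim, 1)    # continuous bit-width score

    def propagate(self, h, adj):
        # Mean aggregation over neighboring layers (simple GCN-style step).
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
        return adj @ h / deg

    def forward(self, x, adj):
        h = torch.relu(self.lin1(x))
        h = torch.relu(self.lin2(self.propagate(h, adj)))
        prune_ratio = torch.sigmoid(self.prune_head(h)).squeeze(-1)
        # Map the score to a bit-width in an assumed range of [2, 8] bits.
        bits = 2.0 + 6.0 * torch.sigmoid(self.bit_head(h)).squeeze(-1)
        return prune_ratio, bits

# Toy usage: 5 layers, 4 hand-crafted features each (e.g. params, FLOPs,
# depth, fan-in), with a chain adjacency linking consecutive layers.
feats = torch.randn(5, 4)
adj = torch.eye(5) + torch.diag(torch.ones(4), 1) + torch.diag(torch.ones(4), -1)
policy = LayerGraphPolicy(in_dim=4)
ratios, bits = policy(feats, adj)
```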


End-to-end learned image compression (LIC) has become a promising alternative for lossy image compression. However, deployment of LIC models is restricted by excessive network parameters and high computational complexity. Existing LIC models implemented entirely with integer networks suffer significant degradation in rate-distortion (R-D) performance. In this paper, we propose a novel fully integerized LIC model that leverages channel-wise weight and activation quantization.
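
As a generic illustration of channel-wise quantization (not the paper's integerized inference pipeline), the sketch below applies symmetric per-channel fake quantization, computing one scale per output channel for weights and one per channel for activations; the 8-bit setting and the epsilon floor on the scale are assumptions.

```python
import torch

def quantize_per_channel(x, num_bits=8, channel_dim=0):
    """Symmetric per-channel fake quantization (illustrative).
    A separate scale is computed for each slice along channel_dim."""
    qmax = 2 ** (num_bits - 1) - 1
    # Max absolute value per channel, kept broadcastable back onto x.
    dims = [d for d in range(x.dim()) if d != channel_dim]
    max_abs = x.abs().amax(dim=dims, keepdim=True).clamp(min=1e-8)
    scale = max_abs / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    return q * scale, scale  # dequantized tensor and per-channel scales

# Example: per-output-channel weights of a conv, per-channel activations.
w = torch.randn(64, 3, 3, 3)      # (out_ch, in_ch, kH, kW)
a = torch.randn(1, 64, 32, 32)    # (N, C, H, W)
w_q, w_scale = quantize_per_channel(w, num_bits=8, channel_dim=0)
a_q, a_scale = quantize_per_channel(a, num_bits=8, channel_dim=1)
```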


Quantizing one deep neural network to multiple compression rates (precisions) has recently been considered for flexible deployment in real-world scenarios. However, existing methods for network quantization under multiple compression rates either rely on fixed-precision bit-width allocation or heuristically search for a mixed-precision strategy, and cannot balance efficiency and performance well.
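
As one possible (assumed) way to serve a single network at several precisions with shared weights, the sketch below quantizes the same parameters on the fly at a selectable bit-width; the per-tensor fake quantizer and the bit-width list (2, 4, 8) are illustrative and not the method proposed in the abstract.

```python
import torch
import torch.nn as nn

def fake_quant(w, num_bits):
    """Symmetric per-tensor fake quantization (illustrative)."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    return torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale

class SwitchableLinear(nn.Module):
    """One set of weights served at several precisions (weight sharing)."""
    def __init__(self, in_f, out_f, bit_widths=(2, 4, 8)):
        super().__init__()
        self.linear = nn.Linear(in_f, out_f)
        self.bit_widths = bit_widths
        self.active_bits = bit_widths[-1]  # default: highest precision

    def forward(self, x):
        w_q = fake_quant(self.linear.weight, self.active_bits)
        return nn.functional.linear(x, w_q, self.linear.bias)

layer = SwitchableLinear(128, 10)
x = torch.randn(4, 128)
for bits in layer.bit_widths:      # same parameters, three compression rates
    layer.active_bits = bits
    out = layer(x)
```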


Lightweight neural networks (LNNs) nowadays play a vital role in embedded applications with limited resources. Quantizing an LNN to low bit precision is an effective solution that further reduces computational and memory requirements. However, it remains challenging to avoid significant accuracy degradation relative to a heavy neural network, owing to the numerical approximation and lower redundancy of the quantized LNN. In this paper, we propose a novel robustness-aware self-reference quantization scheme for LNNs (SRQ), as Fig.
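
The abstract is cut off before the scheme is described, so the sketch below is only a generic illustration of a self-reference setup: the quantized forward pass of a model is regularized toward the full-precision forward pass of the same weights via a distillation-style loss. The straight-through estimator, the 4-bit setting, and the KL term are assumptions, not the SRQ algorithm.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fake_quant(w, num_bits=4):
    """Symmetric per-tensor fake quantization with a straight-through
    estimator so gradients reach the latent full-precision weights."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max().detach().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    return w + (q - w).detach()  # STE: forward uses q, backward uses identity

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1, self.fc2 = nn.Linear(32, 64), nn.Linear(64, 10)

    def forward(self, x, num_bits=None):
        # num_bits=None runs the full-precision "self-reference" path.
        q = (lambda w: w) if num_bits is None else (lambda w: fake_quant(w, num_bits))
        h = F.relu(F.linear(x, q(self.fc1.weight), self.fc1.bias))
        return F.linear(h, q(self.fc2.weight), self.fc2.bias)

model, x, y = TinyNet(), torch.randn(8, 32), torch.randint(0, 10, (8,))
logits_q = model(x, num_bits=4)        # quantized forward pass
with torch.no_grad():
    logits_fp = model(x)               # full-precision reference pass
loss = F.cross_entropy(logits_q, y) + F.kl_div(
    F.log_softmax(logits_q, dim=-1), F.softmax(logits_fp, dim=-1),
    reduction="batchmean")
loss.backward()
```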
