
Deep learning for image classification has achieved remarkable success, but at the cost of high resource demands. Model compression through automatic joint pruning-quantization addresses this issue, yet most existing techniques overlook a critical aspect: layer correlations. These correlations are essential because they expose redundant computations across layers, and leveraging them facilitates efficient design-space exploration. This study employs Graph Neural Networks (GNNs) to learn these inter-layer relationships and thereby optimize the pruning-quantization strategy for the target model.
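
As a rough illustration of the idea (not the authors' implementation), the sketch below builds a small message-passing network in PyTorch that maps hand-crafted per-layer features and a layer-adjacency matrix to a per-layer pruning ratio and bit-width; the feature set, the two output heads, and the bit-width range [2, 8] are all assumptions.

```python
import torch
import torch.nn as nn

class LayerGraphPolicy(nn.Module):
    """GNN-style policy: per-layer features + layer adjacency ->
    a pruning ratio and a bit-width for every layer (illustrative)."""
    def __init__(self, in_dim, hidden_dim=64):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden_dim)
        self.lin2 = nn.Linear(hidden_dim, hidden_dim)
        self.prune_head = nn.Linear(hidden_dim, 1)  # pruning ratio in (0, 1)
        self.bit_head = nn.Linear(hidden_dim, 1)    # continuous bit-width score

    def propagate(self, h, adj):
        # Mean aggregation over neighboring layers (simple GCN-style step).
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
        return adj @ h / deg

    def forward(self, x, adj):
        h = torch.relu(self.lin1(x))
        h = torch.relu(self.lin2(self.propagate(h, adj)))
        prune_ratio = torch.sigmoid(self.prune_head(h)).squeeze(-1)
        # Map the score to a bit-width in an assumed range of [2, 8] bits.
        bits = 2.0 + 6.0 * torch.sigmoid(self.bit_head(h)).squeeze(-1)
        return prune_ratio, bits

# Toy usage: 5 layers, 4 hand-crafted features each (e.g. params, FLOPs,
# depth, fan-in), with a chain adjacency linking consecutive layers.
feats = torch.randn(5, 4)
adj = torch.eye(5) + torch.diag(torch.ones(4), 1) + torch.diag(torch.ones(4), -1)
policy = LayerGraphPolicy(in_dim=4)
ratios, bits = policy(feats, adj)
```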


End-to-end learned image compression (LIC) has become a promising alternative for lossy image compression. However, deployment of LIC models is restricted by excessive network parameters and high computational complexity. Existing LIC models implemented entirely with integer networks suffer significant degradation in rate-distortion (R-D) performance. In this paper, we propose a novel fully integerized LIC model that leverages channel-wise weight and activation quantization.
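
As a generic illustration of channel-wise quantization (not the paper's integerized inference pipeline), the sketch below applies symmetric per-channel fake quantization, computing one scale per output channel for weights and one per channel for activations; the 8-bit setting and the epsilon floor on the scale are assumptions.

```python
import torch

def quantize_per_channel(x, num_bits=8, channel_dim=0):
    """Symmetric per-channel fake quantization (illustrative).
    A separate scale is computed for each slice along channel_dim."""
    qmax = 2 ** (num_bits - 1) - 1
    # Max absolute value per channel, kept broadcastable back onto x.
    dims = [d for d in range(x.dim()) if d != channel_dim]
    max_abs = x.abs().amax(dim=dims, keepdim=True).clamp(min=1e-8)
    scale = max_abs / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    return q * scale, scale  # dequantized tensor and per-channel scales

# Example: per-output-channel weights of a conv, per-channel activations.
w = torch.randn(64, 3, 3, 3)      # (out_ch, in_ch, kH, kW)
a = torch.randn(1, 64, 32, 32)    # (N, C, H, W)
w_q, w_scale = quantize_per_channel(w, num_bits=8, channel_dim=0)
a_q, a_scale = quantize_per_channel(a, num_bits=8, channel_dim=1)
```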


Quantizing one deep neural network to multiple compression rates (precisions) has recently been considered for flexible deployment in real-world scenarios. However, existing methods for network quantization under multiple compression rates either rely on fixed-precision bit-width allocation or heuristically search for a mixed-precision strategy, and cannot balance efficiency and performance well.
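
As one possible (assumed) way to serve a single network at several precisions with shared weights, the sketch below quantizes the same parameters on the fly at a selectable bit-width; the per-tensor fake quantizer and the bit-width list (2, 4, 8) are illustrative and not the method proposed in the abstract.

```python
import torch
import torch.nn as nn

def fake_quant(w, num_bits):
    """Symmetric per-tensor fake quantization (illustrative)."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    return torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale

class SwitchableLinear(nn.Module):
    """One set of weights served at several precisions (weight sharing)."""
    def __init__(self, in_f, out_f, bit_widths=(2, 4, 8)):
        super().__init__()
        self.linear = nn.Linear(in_f, out_f)
        self.bit_widths = bit_widths
        self.active_bits = bit_widths[-1]  # default: highest precision

    def forward(self, x):
        w_q = fake_quant(self.linear.weight, self.active_bits)
        return nn.functional.linear(x, w_q, self.linear.bias)

layer = SwitchableLinear(128, 10)
x = torch.randn(4, 128)
for bits in layer.bit_widths:      # same parameters, three compression rates
    layer.active_bits = bits
    out = layer(x)
```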


Lightweight neural networks (LNNs) nowadays play a vital role in embedded applications with limited resources. Quantizing an LNN to low bit precision is an effective solution that further reduces computational and memory requirements. However, it remains challenging to avoid significant accuracy degradation relative to a heavy neural network, owing to the numerical approximation and lower redundancy of the quantized LNN. In this paper, we propose a novel robustness-aware self-reference quantization scheme for LNNs (SRQ), as Fig.
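
The abstract is cut off before the scheme is described, so the sketch below is only a generic illustration of a self-reference setup: the quantized forward pass of a model is regularized toward the full-precision forward pass of the same weights via a distillation-style loss. The straight-through estimator, the 4-bit setting, and the KL term are assumptions, not the SRQ algorithm.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fake_quant(w, num_bits=4):
    """Symmetric per-tensor fake quantization with a straight-through
    estimator so gradients reach the latent full-precision weights."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max().detach().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    return w + (q - w).detach()  # STE: forward uses q, backward uses identity

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1, self.fc2 = nn.Linear(32, 64), nn.Linear(64, 10)

    def forward(self, x, num_bits=None):
        # num_bits=None runs the full-precision "self-reference" path.
        q = (lambda w: w) if num_bits is None else (lambda w: fake_quant(w, num_bits))
        h = F.relu(F.linear(x, q(self.fc1.weight), self.fc1.bias))
        return F.linear(h, q(self.fc2.weight), self.fc2.bias)

model, x, y = TinyNet(), torch.randn(8, 32), torch.randint(0, 10, (8,))
logits_q = model(x, num_bits=4)        # quantized forward pass
with torch.no_grad():
    logits_fp = model(x)               # full-precision reference pass
loss = F.cross_entropy(logits_q, y) + F.kl_div(
    F.log_softmax(logits_q, dim=-1), F.softmax(logits_fp, dim=-1),
    reduction="batchmean")
loss.backward()
```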
