To reduce the memory requirement of neural network models, we proposed exponent sharing [1] for weights stored in floating-point precision. The concept is depicted in the following figure. With the proposed storage method, each float is stored as a sign, a mantissa, and an index into a table of the exponents shared within a layer.
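The sketch below illustrates the idea in NumPy; names such as `share_exponents` and `reconstruct` are illustrative and not necessarily the repository's API. The 8-bit exponent of each Float32 weight is replaced by an index into a per-layer table of the distinct exponents, while the sign and mantissa are kept unchanged, so the representation is lossless.

```python
import numpy as np

def share_exponents(weights):
    """Split float32 weights into sign bits, mantissas, and indices into a
    shared table of the layer's distinct exponents (illustrative sketch)."""
    bits = weights.astype(np.float32).ravel().view(np.uint32)
    sign = (bits >> 31) & 0x1                      # 1 bit per weight
    exponent = ((bits >> 23) & 0xFF).astype(np.uint8)  # 8-bit biased exponent
    mantissa = bits & 0x7FFFFF                     # 23 bits per weight
    # The distinct exponents of one layer are few, so store them once in a table
    # and keep only a short index per weight instead of the full exponent.
    table, index = np.unique(exponent, return_inverse=True)
    return sign, index.astype(np.uint8), mantissa, table

def reconstruct(sign, index, mantissa, table, shape):
    """Rebuild the original float32 weights from the shared representation."""
    bits = (sign.astype(np.uint32) << 31) \
         | (table[index].astype(np.uint32) << 23) \
         | mantissa
    return bits.view(np.float32).reshape(shape)

w = np.random.randn(4, 4).astype(np.float32)
packed = share_exponents(w)
assert np.array_equal(reconstruct(*packed, w.shape), w)  # lossless round trip
```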
Matrix multiplications are at the core of neural networks. This repository contains code for General Matrix Multiplication (GEMM) with and without exponent sharing. Any pre-trained model can benefit from exponent sharing in terms of storage: layerwise exponent sharing saves at least 9% of the weight memory with no accuracy loss when the weights are in IEEE Float32 format, since each weight keeps only a short index in place of its full 8-bit exponent. One such example is included here.
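For intuition on the 9% figure and on how a GEMM can consume the shared representation, here is a hedged sketch that reuses the `share_exponents` and `reconstruct` helpers from the block above. It assumes a layer uses at most 32 distinct exponents, so a 5-bit index replaces the 8-bit exponent; the exact index width in the repository's implementation may differ.

```python
import numpy as np

# Storage per weight (bits), assuming at most 32 distinct exponents per layer:
#   plain Float32:     1 (sign) + 8 (exponent) + 23 (mantissa) = 32
#   exponent sharing:  1 (sign) + 5 (index)    + 23 (mantissa) = 29
# Saving: 3/32 ≈ 9.4% per weight, plus a negligible shared table of <= 32 exponents.

def gemm_shared(x, sign, index, mantissa, table, shape):
    """GEMM whose weight operand is stored in the shared-exponent format and
    expanded back to float32 just before the multiply, so the result is
    identical to a plain float32 GEMM (illustrative sketch)."""
    w = reconstruct(sign, index, mantissa, table, shape)  # helper from the sketch above
    return x @ w

x = np.random.randn(2, 4).astype(np.float32)
w = np.random.randn(4, 3).astype(np.float32)
packed = share_exponents(w)
assert np.array_equal(gemm_shared(x, *packed, w.shape), x @ w)  # same result
```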
[1] P. Kashikar, S. Sinha and A. K. Verma, "Exploiting Weight Statistics for Compressed Neural Network Implementation on Hardware," 2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2021, pp. 1-4, doi: 10.1109/AICAS51828.2021.9458581.