[TorchFX] Weights Compression Support (#2891)
### Changes

1. Added a weights compression implementation based on the template weights compression.
2. Modified the Torch FX graph builder to handle an edge case in which an embedding node's weight was not placed on the correct port.
3. Updated the Torch weights compression tests to include the FX embedding metatype, so that some Torch test functions can be reused in the FX tests.

### Reason for changes

To support `nncf.compress_weights()` for Torch FX models.

### Tests

Added tests at `tests/torch/fx/test_compress_weights.py`. They reuse the models and some tests from the Torch implementation and add extra checks, such as verifying that the compressed model is smaller than the original model.

### Performance

Inference times for tinyllama-1.1b-step-50k-105b:

- Torch FX, compressed: 0.963s
- Torch FX, compiled with OV backend: 0.074s
- Torch FX, compiled with OV backend and compressed: 0.04s
- OV FP32: 0.079s
- OV int8: 0.039s

### Constraints

Currently only Torch FX representations extracted using `torch._export.capture_pre_autograd_graph()` are supported. #2987 outlines the request to support weights compression for FX models extracted using `torch.export.export`.
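### Usage sketch

The flow described above (capture the model with `torch._export.capture_pre_autograd_graph()`, then call `nncf.compress_weights()`) might look roughly like the sketch below. The helper name `compress_fx_model` is hypothetical and not part of this change. The sketch assumes a PyTorch version that still provides `capture_pre_autograd_graph` and an NNCF build that includes this commit, and it calls `nncf.compress_weights()` with its default settings.

```python
def compress_fx_model(model, example_inputs):
    """Capture `model` as a Torch FX graph and compress its weights.

    Hypothetical helper for illustration only.
    `example_inputs` is a tuple of positional example inputs for `model`.
    """
    # Imports are local so this sketch can be imported without torch/nncf installed.
    import torch
    import nncf
    from torch._export import capture_pre_autograd_graph

    model.eval()
    with torch.no_grad():
        # Extract the FX representation; per the constraints above, this is
        # the only capture path this change supports.
        fx_model = capture_pre_autograd_graph(model, example_inputs)

    # Compress the weights of the captured FX model with NNCF's default settings.
    return nncf.compress_weights(fx_model)
```

A call might look like `compress_fx_model(model, (example_input,))`, where `model` is a `torch.nn.Module` and `example_input` is a tensor it accepts. The result can then be compiled or evaluated like any other FX graph module.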
Showing 18 changed files with 951 additions and 72 deletions.