Post-training Activation Pruning algorithm #2683

yujiepan-work · 2024-05-16T08:00:23Z

Changes

Add torch backend implementation for Post-training Activation Pruning algorithm.
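The description does not spell out the algorithm itself. As a rough illustration of magnitude-based activation pruning in general, the sketch below calibrates a threshold as a running mean of per-batch magnitude quantiles and then zeroes activations at or below it. All function names, the running-mean update rule, and the sample numbers are assumptions made for illustration; this is not the code added in this PR.

```python
# Illustrative sketch only: helper names and the update rule are assumptions,
# not the implementation from this PR.

def magnitude_threshold(values, target_sparsity):
    """Return a magnitude such that about `target_sparsity` of the values fall at or below it."""
    magnitudes = sorted(abs(v) for v in values)
    num_pruned = int(len(magnitudes) * target_sparsity)
    if num_pruned == 0:
        return 0.0  # nothing to prune; only exact zeros are at or below 0.0
    return magnitudes[num_pruned - 1]


def calibrate_threshold(batches, target_sparsity):
    """Average the per-batch thresholds with an incremental (running) mean."""
    running = 0.0
    for step, batch in enumerate(batches, start=1):
        batch_threshold = magnitude_threshold(batch, target_sparsity)
        running += (batch_threshold - running) / step
    return running


def prune(values, threshold):
    """Zero every activation whose magnitude is at or below the threshold."""
    return [0.0 if abs(v) <= threshold else v for v in values]


# Hypothetical calibration activations for a single layer.
calibration_batches = [
    [0.1, -0.5, 2.0, -3.0],
    [0.3, 0.7, -1.0, 4.0],
]
threshold = calibrate_threshold(calibration_batches, target_sparsity=0.5)
pruned = prune([0.2, -0.8, 1.5, -0.05], threshold)
print(threshold, pruned)
```

Because the threshold is fixed after calibration, the sparsity reached on new inputs only approximates the target.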

The current interface is:

import nncf
from nncf import IgnoredScope
from nncf.experimental.torch.sparsify_activations import TargetScope

nncf.experimental.torch.sparsify_activations.sparsify_activations(
    torch_model,
    calibration_dataset,
    # Target 50% activation sparsity for layers whose names match the pattern.
    target_sparsity_by_scope={
        TargetScope(patterns=[".*linear.*"]): 0.5,
    },
    # An empty IgnoredScope excludes nothing.
    ignored_scope=IgnoredScope(),
)
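To make the `target_sparsity_by_scope` mapping more concrete, here is a minimal sketch of how regex patterns could be resolved to per-layer sparsity targets. The helper `resolve_target_sparsity`, the layer names, and the use of `re.match` are assumptions for illustration; the matching rules NNCF actually applies may differ.

```python
import re


def resolve_target_sparsity(layer_names, sparsity_by_pattern):
    """Map each layer name to the sparsity of the single pattern that matches it."""
    resolved = {}
    for name in layer_names:
        for pattern, sparsity in sparsity_by_pattern.items():
            if re.match(pattern, name):
                if name in resolved:
                    # Ambiguous configuration: two patterns claim the same layer.
                    raise ValueError(f"Layer {name!r} matches more than one pattern")
                resolved[name] = sparsity
    return resolved


# Hypothetical layer names, not taken from any real model.
layer_names = [
    "model.layers.0.mlp.up_linear",
    "model.layers.0.mlp.down_linear",
    "model.layers.0.attn.q_proj",
]
resolved = resolve_target_sparsity(layer_names, {".*linear.*": 0.5})
print(resolved)
```

Layers that match no pattern are simply left out of the result, which here corresponds to leaving them dense.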

See model accuracy check at #2683 (comment)

Tests

Added unit tests at tests/torch/experimental/sparsify_activations
Added conformance tests at tests/post_training/experimental/sparsify_activations

alexsu52

Thanks for the contribution!

Could you add accuracy and performance results to the description?

yujiepan-work · 2024-06-07T07:30:46Z

> Thanks for the contribution!
>
> Could you add accuracy and performance results to the description?

Thanks for your reply! We will add this by the beginning of work week 24 (ww24).

nikita-savelyevv

Great work with the implementation, Yujie! Also, the tests are very thorough. Mostly minor comments from my side.

nncf/experimental/torch/sparsify_activations/sparsify_activations_impl.py

nncf/experimental/torch/sparsify_activations/torch_backend.py

tests/torch/experimental/sparsify_activations/test_algo.py

nncf/experimental/torch/sparsify_activations/__init__.py

tests/post_training/experimental/sparsify_activations/test_sparsify_activations_conformance.py

tests/torch/experimental/sparsify_activations/test_algo.py

codecov · 2024-06-07T17:14:30Z

Codecov Report

Attention: Patch coverage is 0% with 187 lines in your changes missing coverage. Please review.

Project coverage is 62.10%. Comparing base (3d11e8a) to head (3c9a7b7).
Report is 49 commits behind head on develop.

Additional details and impacted files

@@             Coverage Diff              @@
##           develop    #2683       +/-   ##
============================================
+ Coverage    47.13%   62.10%   +14.96%     
============================================
  Files          495      486        -9     
  Lines        45986    46630      +644     
============================================
+ Hits         21675    28959     +7284     
+ Misses       24311    17671     -6640
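As a quick sanity check on the diff above, the coverage percentages follow from the hit and miss counts. The sketch below recomputes them; the numbers are copied from the report, and the helper name `coverage_percent` is made up for this example.

```python
def coverage_percent(hits, misses):
    """Return (total lines, coverage in percent) for the given hit and miss counts."""
    lines = hits + misses
    return lines, hits / lines * 100.0


# Base (develop) and head (#2683) counts as listed in the Codecov diff.
base_lines, base_coverage = coverage_percent(hits=21675, misses=24311)
head_lines, head_coverage = coverage_percent(hits=28959, misses=17671)

print(f"base: {base_lines} lines, {base_coverage:.2f}% covered")
print(f"head: {head_lines} lines, {head_coverage:.2f}% covered")
```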

| Files | Coverage Δ |
|---|---|
| ...xperimental/torch/sparsify_activations/__init__.py | `0.00% <0.00%> (ø)` |
| .../sparsify_activations/sparsify_activations_impl.py | `0.00% <0.00%> (ø)` |
| ...mental/torch/sparsify_activations/torch_backend.py | `0.00% <0.00%> (ø)` |

... and 223 files with indirect coverage changes

| Flag | Coverage Δ | |
|---|---|---|
| COMMON | `41.93% <ø> (-1.64%)` | ⬇️ |
| ONNX | `34.04% <0.00%> (-0.71%)` | ⬇️ |
| OPENVINO | `40.80% <0.00%> (+0.87%)` | ⬆️ |
| TENSORFLOW | `29.27% <0.00%> (?)` | |

Flags with carried forward coverage won't be shown.

| Components | Coverage Δ | |
|---|---|---|
| common | `87.56% <ø> (+17.89%)` | ⬆️ |
| torch | `32.82% <0.00%> (-0.37%)` | ⬇️ |
| tensorflow | `93.26% <ø> (+93.26%)` | ⬆️ |
| onnx | `93.06% <ø> (+<0.01%)` | ⬆️ |
| openvino | `94.51% <ø> (+0.36%)` | ⬆️ |
| ptq | `82.04% <ø> (+2.64%)` | ⬆️ |

yujiepan-work · 2024-06-11T15:27:43Z

Model accuracy check:

Llama-2-7b-hf

| Configuration | Model-level sparsity | Avg. zero-shot accuracy (8 tasks) | WikiText perplexity |
|---|---|---|---|
| FP16 baseline | - | 71.15 | 8.79 |
| FP16 sparse | 25% (up/gate 32% + down 52%) | 70.73 (-0.59%) | 9.01 |
| INT8 sparse | 25% (up/gate 32% + down 52%) | 70.88 (-0.38%) | 9.01 |

Mixtral-8x7B-Instruct-v0.1

| Configuration | Model-level sparsity | Avg. zero-shot accuracy (8 tasks) | WikiText perplexity |
|---|---|---|---|
| FP16 baseline | - | 80.08 | 6.22 |
| FP16 sparse | 40% (up/gate 42% + down 52%) | 79.70 (-0.48%) | 6.56 |

Tasks: `arc_easy`, `arc_challenge`, `boolq`, `piqa`, `lambada_openai`, `winogrande`, `sciq`, `hellaswag`
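The percentages in parentheses read as relative changes against the FP16 baseline. A small sketch of that arithmetic, using the rounded scores from the tables (the helper name is hypothetical):

```python
def relative_change_percent(baseline, value):
    """Relative change of `value` with respect to `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


# Average zero-shot accuracies copied from the tables above.
llama_fp16_sparse = relative_change_percent(71.15, 70.73)
llama_int8_sparse = relative_change_percent(71.15, 70.88)
mixtral_fp16_sparse = relative_change_percent(80.08, 79.70)

for label, change in [
    ("Llama-2-7b-hf FP16 sparse", llama_fp16_sparse),
    ("Llama-2-7b-hf INT8 sparse", llama_int8_sparse),
    ("Mixtral-8x7B-Instruct-v0.1 FP16 sparse", mixtral_fp16_sparse),
]:
    print(f"{label}: {change:+.2f}%")
```

Recomputing from the rounded scores gives about -0.47% for Mixtral rather than the reported -0.48%; presumably the reported figure was computed from unrounded accuracies.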

yujiepan-work · 2024-06-25T14:13:13Z

Remaining issues at the moment:

* README markdown in the algo folder
* Documentation: example of using the interface
* `target_sparsity_by_scope` interface design

nikita-savelyevv

Another round of review. I am waiting for the activation sparsity test pipeline to be added to our CI so that the conformance test results can be checked there.

nncf/experimental/torch/sparsify_activations/sparsify_activations_impl.py

nncf/experimental/torch/sparsify_activations/target_scope.py

nncf/experimental/torch/sparsify_activations/torch_backend.py

nikita-savelyevv · 2024-07-01T09:13:08Z

@daniil-lyakhov could you please review the PyTorch backend part of the implementation, specifically the `PTSparsifyActivationsAlgoBackend` class?

yujiepan-work · 2024-07-10T06:56:56Z

@daniil-lyakhov Hi Daniil, thanks for reviewing this PR. Are there any open issues that we still need to address?

daniil-lyakhov

LGTM 👍

nikita-savelyevv · 2024-07-10T11:51:53Z

Conformance test build (id=12) has passed. It takes about half an hour.

The only thing left is some documentation of the method.

nikita-savelyevv

Looks good! I can suggest only some minor tweaks.

nncf/experimental/torch/sparsify_activations/ActivationSparsity.md

yujiepan-work · 2024-07-16T08:38:00Z

One test failed, but the failure does not seem to be caused by this PR:

FAILED tests/common/graph/test_dot_file_rw.py::test_colons_are_replaced_in_written_dot_file - AssertionError: assert False
 +  where False = <function cmp at 0x7f52758f80d0>(PosixPath('/tmp/pytest-of-runner/pytest-0/popen-gw0/test_colons_are_replaced_in_wr0/graph.dot'), PosixPath('/home/runner/work/nncf/nncf/tests/common/data/reference_graphs/dot_rw_reference.dot'))
 +    where <function cmp at 0x7f52758f80d0> = filecmp.cmp

Update: the failed test passes after a retry.
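For context, the assertion in that test comes from `filecmp.cmp` returning `False` for the written and reference files. A minimal sketch of that kind of check, with made-up file names and contents, and with `shallow=False` chosen here to force a byte-by-byte comparison (the actual test may rely on the default):

```python
import filecmp
import tempfile
from pathlib import Path


def files_match(written, reference):
    """Compare two files by content rather than by os.stat() signature."""
    return filecmp.cmp(written, reference, shallow=False)


with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    reference = tmp_dir / "reference.dot"
    same = tmp_dir / "same.dot"
    different = tmp_dir / "different.dot"

    # Hypothetical graph contents, only to exercise the comparison.
    reference.write_text('digraph {\n"a" -> "b";\n}\n')
    same.write_text('digraph {\n"a" -> "b";\n}\n')
    different.write_text('digraph {\n"a" -> "c";\n}\n')

    identical_result = files_match(same, reference)
    different_result = files_match(different, reference)

print(identical_result, different_result)
```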

yujiepan-work · 2024-07-16T15:20:55Z

Thank you all for the reviews! Since we have resolved all the issues, I wonder whether this PR is ready to be merged. If there are any further changes needed, we would be glad to deal with them. 🙂

nikita-savelyevv · 2024-07-17T11:13:52Z

@alexsu52, please take a look. The PR should be ready for merging.

cc @AlexKoff88 @MaximProshin

alexsu52

Thanks for the contribution!

github-actions bot added the experimental label May 16, 2024

yujiepan-work force-pushed the 24h1/sparse-activation/nncf-pr branch 2 times, most recently from f2a2cbc to c6c753d Compare May 23, 2024 08:20

github-actions bot added the NNCF PT Pull requests that updates NNCF PyTorch label May 29, 2024

github-actions bot added the NNCF PTQ Pull requests that updates NNCF PTQ label Jun 6, 2024

yujiepan-work marked this pull request as ready for review June 7, 2024 01:15

yujiepan-work requested a review from a team as a code owner June 7, 2024 01:15

alexsu52 reviewed Jun 7, 2024

nikita-savelyevv requested changes Jun 7, 2024

yujiepan-work changed the title ~~Statistical-Conditioned Activation Pruning (SCAP) for LLMs~~ Post-training Activation Pruning algorithm Jun 11, 2024

yujiepan-work force-pushed the 24h1/sparse-activation/nncf-pr branch 3 times, most recently from f9af328 to 61a0fc8 Compare June 25, 2024 05:55

nikita-savelyevv reviewed Jul 1, 2024

yujiepan-work force-pushed the 24h1/sparse-activation/nncf-pr branch 2 times, most recently from 7c81fc5 to 17792bd Compare July 8, 2024 18:50

daniil-lyakhov approved these changes Jul 10, 2024

yujiepan-work force-pushed the 24h1/sparse-activation/nncf-pr branch from e1f7af2 to 63a87e8 Compare July 10, 2024 10:03

github-actions bot added the documentation Improvements or additions to documentation label Jul 12, 2024

yujiepan-work force-pushed the 24h1/sparse-activation/nncf-pr branch from 5cb146a to f3fcea8 Compare July 12, 2024 16:26

nikita-savelyevv reviewed Jul 13, 2024

first commit

73f5e66

yujiepan-work and others added 23 commits July 16, 2024 16:39

use higher precision to calculate running_threshold

9439b82

delete apply_sparsifiers as it is not needed

a90dc18

make freeze a property

53f6b26

enhance reproducibility

8a5fc4a

use fp16 on cuda

6e35851

fix int8+sparse export

9171f0c

update metric

b652785

update ref metric

c67753f

Initial documentation of sparsify_activations algorithm

303a66a

Revise ActivationSparsity.md

e90f39a

update readme

4b8eb09

style fix

238d6eb

update main readme

cb39cfd

fix equation

893947f

update readme

93844a9

documentation update

35e279c

update arxiv link

6491ba8

mention dejavu for acceleration example

aa67251

Revise ActivationSparsity.md

b1ea929

* remove the ignored_scope argument from the sparsify_activations call in the example snippet

Revise ActivationSparsity.md

d8a0ef6

* Add notes about experimental features and the in-development runtime kernel
* Elaborate on target support being limited to Linear layers for LLMs

style fix

98bd3ba

fix compress_weights name

0613dce

minor fix for "L"inear and parentheses for citation

5f04275

yujiepan-work force-pushed the 24h1/sparse-activation/nncf-pr branch from 007c04a to 5f04275 Compare July 16, 2024 08:40

nikita-savelyevv approved these changes Jul 16, 2024

alexsu52 approved these changes Jul 19, 2024

alexsu52 merged commit 49e9820 into openvinotoolkit:develop Jul 19, 2024
12 checks passed

nikita-savelyevv mentioned this pull request Aug 28, 2024

Activation Sparsity OV backend #2924

Draft
