Commit: Update on "Autoquant"
Summary: Adds autoquantization functionality. Using the do_quant API, we can
benchmark kernel speeds and pick the best quantization type (or no
quantization) for each layer.

Test Plan: python test/test.py -k "autoquant"

also tested on SAM and SDXL
pytorch-labs/segment-anything-fast#114
HDCharles/sdxl-fast@8d9942a
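The per-layer selection described in the summary can be sketched roughly as follows. This is a minimal illustration of the idea (time each candidate quantization of a layer, including leaving it unquantized, and keep the fastest); `autoquant_layer`, `identity`, and `fake_int8` are hypothetical names for this sketch, not torchao's actual do_quant API.

```python
import timeit

def autoquant_layer(layer_fn, candidates, sample_input, trials=5):
    """Return (name, fn) for the candidate transform whose replacement
    layer runs fastest on sample_input.

    candidates: dict mapping a name to a transform that takes the original
    layer function and returns a (possibly quantized) replacement.
    """
    best_name, best_fn, best_time = None, None, float("inf")
    for name, transform in candidates.items():
        fn = transform(layer_fn)
        # Take the best of several timing runs to reduce noise.
        t = min(timeit.repeat(lambda: fn(sample_input), number=trials, repeat=3))
        if t < best_time:
            best_name, best_fn, best_time = name, fn, t
    return best_name, best_fn

# Toy stand-ins for real kernels; "none" keeps the layer unquantized.
def identity(fn):
    return fn

def fake_int8(fn):
    # Placeholder for an int8-quantized implementation of the same layer.
    return lambda x: fn(x)

layer = lambda xs: [v * 2.0 for v in xs]
name, fn = autoquant_layer(
    layer, {"none": identity, "int8": fake_int8}, [1.0, 2.0, 3.0]
)
```

In the real flow, the candidates would be weight subclasses such as `AQInt8DynamicallyQuantizedLinearWeight`, and timing would be done on the actual linear kernels rather than toy callables.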

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D55103983](https://our.internmc.facebook.com/intern/diff/D55103983)

[ghstack-poisoned]
HDCharles committed Mar 19, 2024
2 parents bc2deb7 + 490c7c1 commit 4aae2a3
Showing 1 changed file with 0 additions and 6 deletions.
6 changes: 0 additions & 6 deletions test/test.py
@@ -894,12 +894,6 @@ def test_aq_int8_dynamic_quant_subclass(self):
                 AQInt8DynamicallyQuantizedLinearWeight.from_float, 35, test_dtype
             )
 
-    def test_aq_int8_weight_only_quant_subclass(self):
-        for test_dtype in [torch.float32, torch.float16, torch.bfloat16]:
-            self._test_lin_weight_subclass_impl(
-                AQInt8DynamicallyQuantizedLinearWeight.from_float, 35, test_dtype
-            )
-
     def test_aq_int8_weight_only_quant_subclass(self):
         for test_dtype in [torch.float32, torch.float16, torch.bfloat16]:
             self._test_lin_weight_subclass_impl(
