
Commit 5f5bc8e

Update on "Autoquant"
Summary: Adding autoquantization functionality. Using the do_quant API we can test kernel speeds and pick the best quantization type (or no quantization) for each layer.

Test Plan: python test/test.py -k "autoquant"
Also tested on SAM and SDXL: pytorch-labs/segment-anything-fast#114, HDCharles/sdxl-fast@8d9942a

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D55103983](https://our.internmc.facebook.com/intern/diff/D55103983)

[ghstack-poisoned]
2 parents 55cce68 + 886ac77 commit 5f5bc8e
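
The summary above describes the core idea in one sentence: benchmark each quantization option per layer and keep the fastest. Below is a minimal, hypothetical sketch of that selection loop. It does not use the do_quant API from this stack; the helper names (`benchmark_ms`, `pick_fastest`, `candidate_transforms`) are invented for illustration, with `torch.ao.quantization.quantize_dynamic` standing in as one candidate transform.

```python
# Hypothetical sketch of per-layer autoquant selection: time each candidate
# and keep the fastest. Not the do_quant API; all helper names are invented.
import copy
import time

import torch
import torch.ao.quantization
import torch.nn as nn


def benchmark_ms(module, x, iters=50):
    # Average forward latency in milliseconds (simple CPU timing; a real
    # harness would warm up more and synchronize CUDA).
    with torch.no_grad():
        module(x)  # warm-up
        start = time.perf_counter()
        for _ in range(iters):
            module(x)
    return (time.perf_counter() - start) * 1e3 / iters


def pick_fastest(module, x, candidate_transforms):
    # Compare "leave as float" against every candidate quantization transform.
    best, best_ms = module, benchmark_ms(module, x)
    for transform in candidate_transforms:
        quantized = transform(copy.deepcopy(module))
        ms = benchmark_ms(quantized, x)
        if ms < best_ms:
            best, best_ms = quantized, ms
    return best


# Usage: int8 dynamic quantization as the only candidate besides no quantization.
model = nn.Sequential(nn.Linear(1024, 1024))
x = torch.randn(16, 1024)
candidates = [
    lambda m: torch.ao.quantization.quantize_dynamic(m, {nn.Linear}, dtype=torch.qint8),
]
model = pick_fastest(model, x, candidates)
```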

File tree

2 files changed (+6, -0 lines)

__init__.py

Whitespace-only changes.

test/test.py

Lines changed: 6 additions & 0 deletions
@@ -894,6 +894,12 @@ def test_aq_int8_dynamic_quant_subclass(self):
                 AQInt8DynamicallyQuantizedLinearWeight.from_float, 35, test_dtype
             )
 
+    def test_aq_int8_weight_only_quant_subclass(self):
+        for test_dtype in [torch.float32, torch.float16, torch.bfloat16]:
+            self._test_lin_weight_subclass_impl(
+                AQInt8DynamicallyQuantizedLinearWeight.from_float, 35, test_dtype
+            )
+
     def test_aq_int8_weight_only_quant_subclass(self):
         for test_dtype in [torch.float32, torch.float16, torch.bfloat16]:
             self._test_lin_weight_subclass_impl(
0 commit comments
