Summary:
* Added AWQConfig that takes a base config and made corresponding changes
in other parts of the flow
Test Plan:
TODO
Reviewers:
Subscribers:
Tasks:
Tags:
"""
Configuration for quantizing linear layers when passed into quantize_()

Args:
    quant_dtype: The data type of the quantized weights. Currently only torch.uint4 is intended to be used but can be used with torch.uint1 -> torch.uint8
    `layout`: layout type for quantized tensor, default is `TensorCoreTiledLayout(inner_k_tiles=8)`
    group_size: Quantization granularity. Use -1 for channel wise quantization
    weight_quant_fn: The quantization function to be used, which takes in the weight and returns the quantized weight. If None, then affine uint4 quantization is used
    set_inductor_config: if True, adjusts `torchinductor` settings to recommended values.
"""

base_config: AOBaseConfig
step: str = "convert"
example_input_shape: Optional[List[int]] = None
scale_search_space_size: int = 20
set_inductor_config: bool = True

def __post_init__(self):
    OPTIONS = ["calibrate", "convert", "load"]
    assert self.step in OPTIONS, f"Only {OPTIONS} are supported, got {self.step}"
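The `__post_init__` check above is a common dataclass pattern: field defaults are assigned first, then the hook validates them, so an unsupported `step` fails at construction time rather than deep inside the quantization flow. A minimal, self-contained sketch of that pattern (the `StepConfig` name and standalone class are hypothetical, not the torchao implementation, which also carries a `base_config: AOBaseConfig` field):

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StepConfig:
    # Hypothetical stand-in for the config above; same validated fields.
    step: str = "convert"
    example_input_shape: Optional[List[int]] = None
    scale_search_space_size: int = 20
    set_inductor_config: bool = True

    def __post_init__(self):
        # Runs after the generated __init__ assigns all fields,
        # so invalid values are rejected at construction time.
        OPTIONS = ["calibrate", "convert", "load"]
        assert self.step in OPTIONS, f"Only {OPTIONS} are supported, got {self.step}"


cfg = StepConfig(step="calibrate")  # accepted: "calibrate" is in OPTIONS
try:
    StepConfig(step="train")  # rejected: not in OPTIONS
except AssertionError as e:
    print(e)
```

Validating in `__post_init__` (rather than in each call site) keeps the three-phase calibrate/convert/load flow honest: every consumer of the config can assume `step` is already one of the supported values.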