Commit 4ddf049

Fixed bug for 16a4w ptq (#12167)
Summary: Currently, running the script executorch/examples/models/llama/export_llama.py with the flag --ptq 16a4w performs 16a16w quantization instead; this diff fixes that. This may be related to some open GitHub issues. Differential Revision: D77671468
1 parent: 1466826 · commit: 4ddf049

File tree

1 file changed (+1, −1 lines)

extension/llm/export/quantizer_lib.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -192,7 +192,7 @@ def get_qnn_quantizer(
             act_observer=MinMaxObserver,
         )
     elif quant_config == "16a4w":
-        quant_dtype = QuantDtype.use_16a16w  # pyre-fixme[16]
+        quant_dtype = QuantDtype.use_16a4w  # pyre-fixme[16]
         qnn_quantizer.set_default_quant_config(
             quant_dtype,
             is_qat=is_qat,
```
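The bug fixed here is a copy-paste mismatch in an if/elif chain: the `"16a4w"` branch assigned the `use_16a16w` dtype. One common way to make such a mismatch structurally impossible is to replace the branch chain with a lookup table. The sketch below is a hypothetical illustration, not the actual ExecuTorch code; `QUANT_DTYPES` and `pick_quant_dtype` are invented names, and the string values stand in for the real QNN `QuantDtype` enum members.

```python
# Hypothetical sketch: map a --ptq flag value directly to a quant dtype,
# so each config string can only ever name its own dtype.
QUANT_DTYPES = {
    "8a8w": "use_8a8w",
    "16a16w": "use_16a16w",
    "16a4w": "use_16a4w",  # before the fix, this branch picked use_16a16w
}

def pick_quant_dtype(quant_config: str) -> str:
    """Return the quant dtype name for a given PTQ config string."""
    try:
        return QUANT_DTYPES[quant_config]
    except KeyError:
        raise ValueError(f"Unsupported quant config: {quant_config}")

print(pick_quant_dtype("16a4w"))  # -> use_16a4w
```

With a table, adding a new config is a one-line change and a stale dtype cannot survive unnoticed in a duplicated branch body.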
