
Commit ab3792e

Allow different dtype for scales_and_zeros (#1923)
Summary: Pull Request resolved: #1923

D59410096 tried to allow a different dtype for scales and zeros, but at https://www.internalfb.com/code/fbsource/fbcode/pytorch/ao/torchao/dtypes/uintx/tensor_core_tiled_layout.py?lines=262, where `pack_tinygemm_scales_and_zeros` is called, no dtype argument is passed. As a result, if the scales and zeros are not `torch.bfloat16`, the call fails in `guard_dtype_size`. This diff passes the dtype of the scales and zeros into the call to avoid the issue.

Reviewed By: andrewor14

Differential Revision: D71079504
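For context, a minimal, self-contained sketch of the failure mode. The bodies of `guard_dtype_size` and `pack_tinygemm_scales_and_zeros` below are simplified stand-ins for the torchao helpers, not the exact library code; only the shape of the bug (a `dtype` parameter that defaults to `torch.bfloat16` and a guard that rejects mismatched inputs) is taken from the summary above:

import torch

def guard_dtype_size(tensor, name, dtype=None, size=None):
    # Simplified guard: reject tensors whose dtype or size differs from the expected ones.
    if dtype is not None and tensor.dtype != dtype:
        raise ValueError(f"Expected {name} to have dtype {dtype}, got {tensor.dtype}")
    if size is not None and tensor.size() != size:
        raise ValueError(f"Expected {name} to have size {size}, got {tensor.size()}")

def pack_tinygemm_scales_and_zeros(scales, zeros, dtype=torch.bfloat16):
    # Both inputs are checked against `dtype`, which defaults to bfloat16.
    guard_dtype_size(scales, "scales", dtype=dtype, size=zeros.size())
    guard_dtype_size(zeros, "zeros", dtype=dtype)
    # Pack scales and zeros together into a single tensor.
    return torch.cat([scales.unsqueeze(-1), zeros.unsqueeze(-1)], dim=-1).transpose(0, 1).contiguous()

scale = torch.ones(8, 4, dtype=torch.float16)
zero_point = torch.zeros(8, 4, dtype=torch.float16)

# Before this fix: the dtype argument is omitted, defaults to bfloat16, and
# float16 scales/zeros raise a ValueError inside guard_dtype_size:
#   pack_tinygemm_scales_and_zeros(scale, zero_point)
# After this fix: the caller forwards the actual dtype, so the guard passes.
packed = pack_tinygemm_scales_and_zeros(scale, zero_point, scale.dtype)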
1 parent 386e219 commit ab3792e

File tree

1 file changed (+1, −1)


torchao/dtypes/uintx/tensor_core_tiled_layout.py

Lines changed: 1 addition & 1 deletion
@@ -264,7 +264,7 @@ def from_plain(
         zero_point = zero_point.reshape(int_data.shape[0], -1)
         from torchao.quantization.utils import pack_tinygemm_scales_and_zeros
 
-        scale_and_zero = pack_tinygemm_scales_and_zeros(scale, zero_point)
+        scale_and_zero = pack_tinygemm_scales_and_zeros(scale, zero_point, scale.dtype)
         return cls(packed_weight, scale_and_zero, False, _layout)
 
     def to(self, *args, **kwargs):
