-
Hi, yes, Triton supports int8. Could you please share the error that you see? Maybe I can help fix it.
-
Is anyone doing quantization on x86 from fp32 to int8? I've been looking at some examples and trying to do this for test_matmul.py in the triton_shared/python/examples directory, but I haven't been able to get it working. Does Triton even support int8 at all? For example, if I modify the MLIR file produced after --triton-to-linalg-experimental by changing f32 to i8, I run into translation errors when lowering to affine loops. If I change float32 to int8 at the source level in test_matmul.py, I get an error that suggests int8 is not supported in triton_shared: RuntimeError: "normal_kernel_cpu" not implemented for 'Char'. I have also seen https://pytorch.org/blog/int8-quantization/ but could not get it to work for test_matmul.py.
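One note on that last error: `"normal_kernel_cpu" not implemented for 'Char'` comes from PyTorch itself, not from triton_shared. `torch.randn` samples a normal distribution, which PyTorch only implements for floating-point dtypes ('Char' is PyTorch's name for int8 in these messages). A minimal sketch of the difference, assuming the test inputs in your modified test_matmul.py are created with `torch.randn` (the shapes below are placeholders, not from the original file):

```python
import torch

M, K = 32, 32  # hypothetical test sizes

# torch.randn only supports floating-point dtypes, so asking for int8
# reproduces the error from the question:
#   RuntimeError: "normal_kernel_cpu" not implemented for 'Char'
try:
    a = torch.randn((M, K), device="cpu", dtype=torch.int8)
except RuntimeError as e:
    print(f"randn failed: {e}")

# One workaround (an assumption, not a fix blessed by triton_shared):
# generate integer test data with torch.randint instead.
a = torch.randint(-128, 128, (M, K), device="cpu", dtype=torch.int8)
print(a.dtype)
```

This only addresses how the host-side test tensors are initialized; whether the triton_shared lowering path itself handles i8 matmul is a separate question from the MLIR translation errors you mention.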