[CPU] Add support for dynamic float8 act float8 weight on CPU #2505

Xia-Weiwen · 2025-07-08T07:51:42Z

Summary
This PR adds support for dynamic float8 activation and float8 weight quantization on x86 CPUs.
It adds:

A new layout: Float8DynamicActFloat8WeightCPULayout
Two new ops: float8_linear_prepack_cpu and float8_linear_cpu
CPP kernels for the two new ops

The kernel computes the FP8 GEMM in BF16: the FP8 operands are converted to BF16 for the matrix multiply.

Test plan

pytest test/quantization/test_dynamic_float8_linear_cpu.py

pytorch-bot · 2025-07-08T07:51:47Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2505

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 953ac13 with merge base 64c1ce3:

BROKEN TRUNK - The following job failed but was also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

Run Regression Tests / test-nightly (CUDA Nightly, linux.g5.12xlarge.nvidia.gpu, --pre torch --index-url https://downloa... / linux-job (gh) (trunk failure)
##[error]The operation was canceled.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Xia-Weiwen · 2025-07-10T08:49:27Z

Hi @chunyuan-w @mingfeima Could you please review this PR? Thanks.

chunyuan-w · 2025-07-10T09:30:09Z

Should we move the conversion vec code to this file? https://github.com/pytorch/pytorch/blob/cd995bfb2aac8891465809be3ce29543bd524287/aten/src/ATen/cpu/vec/vec512/vec512_float8.h

Similar to this PR: pytorch/pytorch#152417

Xia-Weiwen · 2025-07-11T01:34:13Z

Thanks for the comment. If we move it to PyTorch, we would need to check at compile time whether the function is available. We can do this step by step; for now it is probably better to keep it here.

chunyuan-w · 2025-07-11T08:26:03Z

torchao/csrc/cpu/float8_linear.cpp

  // scales shape = [Nc, G, block_n]
  int64_t num_groups = weight_scales.size(1);
  int64_t group_size = K / num_groups;

Do we support the case where K % num_groups != 0?

Xia-Weiwen · 2025-07-11

We don't support it. It is guarded by the quantization utilities in torchao, for example in ao/torchao/quantization/quant_primitives.py (line 293 in aee0795):

assert input_size[i] % block_size[i] == 0, (

I have also added a TORCH_CHECK here. Thanks.
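To illustrate why the divisibility check matters, here is a minimal Python sketch of group-wise scaling over the K dimension. It is not the kernel from this PR: the helper names `group_size_for` and `dequantize_row` are hypothetical, the scales are a flat per-row list rather than the `[Nc, G, block_n]` layout, and plain numbers stand in for FP8 values.

```python
def group_size_for(K: int, num_groups: int) -> int:
    """Return K // num_groups, rejecting shapes that do not divide evenly.

    Hypothetical helper: mirrors the intent of a K % num_groups == 0 check.
    """
    if num_groups <= 0:
        raise ValueError(f"num_groups must be positive, got {num_groups}")
    if K % num_groups != 0:
        raise ValueError(f"K ({K}) must be divisible by num_groups ({num_groups})")
    return K // num_groups


def dequantize_row(q_row, scales):
    """Multiply each element by the scale of its contiguous group along K.

    Hypothetical helper: q_row holds K quantized values, scales holds one
    scale per group, so element k belongs to group k // group_size.
    """
    group_size = group_size_for(len(q_row), len(scales))
    return [q * scales[k // group_size] for k, q in enumerate(q_row)]


if __name__ == "__main__":
    print(group_size_for(128, 4))
    print(dequantize_row([1, 2, 3, 4], [0.5, 2.0]))
```

Without the check, integer division would silently truncate: with K = 10 and 4 groups, the group size would be 2, and elements 8 and 9 would map to group index 4, which is past the end of the scales.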

facebook-github-bot added the CLA Signed label Jul 8, 2025

Xia-Weiwen added the topic: new feature and cpu labels Jul 8, 2025

[CPU] Add layout and implementation for dynamic float8 act float8 weight on CPU

736e1f1

Xia-Weiwen added 3 commits July 10, 2025 14:45

Merge branch 'main' into float8_da8w8

5cc5bcc

Refine code

c238385

refine comments

3e7d179

chunyuan-w reviewed Jul 11, 2025

Xia-Weiwen requested a review from chunyuan-w July 11, 2025 10:10

Check K % num_groups == 0

953ac13
