Conversation

pytorchbot
Collaborator

This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #13452 by @SS-JIA
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/288/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/288/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/287/orig
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/288/orig
@diff-train-skip-merge

pytorch-bot bot commented Aug 16, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13466

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Pending, 1 Unrelated Failure

As of commit c2f9db2 with merge base cf669e3:

NEW FAILURE - The following job has failed:

BROKEN TRUNK - The following job failed but was already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

  • pull / test-binary-size-linux-gcc / linux-job (gh) (trunk failure)
    /pytorch/executorch/kernels/portable/cpu/op_stack.cpp:129:26: error: comparison of integer expressions of different signedness: ‘size_t’ {aka ‘long unsigned int’} and ‘ssize_t’ {aka ‘long int’} [-Werror=sign-compare]

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Aug 16, 2025
Base automatically changed from gh/SS-JIA/287/orig to main August 16, 2025 04:14
Pull Request resolved: #13452

## Motivation

Enable testing Vulkan lowering via optimum-executorch.

## Context

Similar to the related PR below: int4 weight-only quantization is currently enabled in Vulkan via a custom source-transform quantizer, which replaces linear layers with a custom linear layer that calls a custom weight-only quantized linear op.
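
Concretely, that source-transform approach swaps out modules ahead of export, along these lines (an illustrative sketch, not the actual Vulkan quantizer code; all names here are made up for the example):

```python
import torch.nn as nn

class WeightOnlyQuantLinear(nn.Module):
    """Stand-in for a custom linear layer backed by a quantized weight."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        # The real layer would pack the weight to int4 with per-group
        # scales; the fp32 weight is kept here so the sketch stays runnable.
        self.weight = linear.weight.detach()
        self.bias = linear.bias

    def forward(self, x):
        # Stand-in for the call to the custom weight-only quantized
        # linear op.
        return nn.functional.linear(x, self.weight, self.bias)

def replace_linear_layers(model: nn.Module) -> nn.Module:
    # Recursively swap every nn.Linear for the custom layer.
    for name, child in model.named_children():
        if isinstance(child, nn.Linear):
            setattr(model, name, WeightOnlyQuantLinear(child))
        else:
            replace_linear_layers(child)
    return model
```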

This diff removes the need for any Vulkan-specific source transforms by adding a fusion pattern for weight-only quantized linear.
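
The fusion idea, as a minimal self-contained sketch using `torch.fx` subgraph rewriting. This is illustrative only: the actual pass lives in the Vulkan backend's fuse-patterns pass and targets its own custom op, while the `demo::woq_linear` op and all helper names below are invented for the example.

```python
import torch
from torch.fx import symbolic_trace, subgraph_rewriter

# Hypothetical stand-in for the custom weight-only quantized linear op.
lib = torch.library.Library("demo", "DEF")
lib.define("woq_linear(Tensor x, Tensor qw, Tensor scales) -> Tensor")

@torch.library.impl(lib, "woq_linear", "CompositeExplicitAutograd")
def woq_linear(x, qw, scales):
    # Reference implementation: dequantize the weight, then linear.
    return torch.nn.functional.linear(x, qw.to(x.dtype) * scales)

def pattern(x, qw, scales):
    # What weight-only quantized linear typically looks like in the
    # exported graph: an explicit dequantize of the integer weight
    # followed by a plain linear.
    return torch.nn.functional.linear(x, qw.to(torch.float32) * scales)

def replacement(x, qw, scales):
    # Fuse the dequantize + linear sequence into a single op call.
    return torch.ops.demo.woq_linear(x, qw, scales)

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.qw = torch.randint(-8, 8, (16, 8), dtype=torch.int8)
        self.scales = torch.rand(16, 1)

    def forward(self, x):
        return torch.nn.functional.linear(
            x, self.qw.to(torch.float32) * self.scales
        )

gm = symbolic_trace(M())
subgraph_rewriter.replace_pattern(gm, pattern, replacement)
print(gm.graph)  # dequantize + linear is now one demo.woq_linear call
```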


## Changes

* Introduce a fusable graph pattern for weight-only quantized linear
* Add fusion logic for weight-only quantized linear to the fuse-patterns pass
* Add a `4w` qmode to the export llama script (example invocation below)
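
A hypothetical invocation of the export script with the new mode (the module path and flag names are assumed from the existing `--qmode` options; check the script's `--help` for the exact interface):

```
python -m examples.models.llama.export_llama \
    --qmode 4w --vulkan \
    -c <checkpoint.pth> -p <params.json>
```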
ghstack-source-id: 303387280

Differential Revision: [D80293302](https://our.internmc.facebook.com/intern/diff/D80293302/)
@SS-JIA SS-JIA force-pushed the gh/SS-JIA/288/orig branch from 5182515 to c2f9db2 August 16, 2025 13:25

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@SS-JIA SS-JIA merged commit 2f4b704 into main Aug 16, 2025
102 of 104 checks passed
@SS-JIA SS-JIA deleted the gh/SS-JIA/288/orig branch August 16, 2025 14:25
agrima1304 pushed a commit to agrima1304/executorch that referenced this pull request Aug 26, 2025